Salman Rahman PRO

salmannyu

https://salmanrahman.net/

AI & ML interests

Natural Language Processing, Deep Learning, Scalable Oversight, and Language Model Evaluation

Recent Activity

upvoted a paper about 16 hours ago

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

upvoted a paper about 20 hours ago

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

submitted a paper about 20 hours ago

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

upvoted a paper about 16 hours ago

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Paper • 2512.07783 • Published 2 days ago • 23

upvoted a paper about 20 hours ago

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

Paper • 2512.03244 • Published 8 days ago • 13

upvoted a paper 2 months ago

The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP

Paper • 2510.05644 • Published Oct 7 • 23

upvoted a paper 5 months ago

A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17 • 259

upvoted 3 papers 6 months ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17 • 49

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Paper • 2506.15677 • Published Jun 18 • 23

Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team

Paper • 2506.14234 • Published Jun 17 • 41

upvoted 3 papers 8 months ago

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Paper • 2504.13203 • Published Apr 15 • 35

MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations

Paper • 2504.07830 • Published Apr 10 • 18

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31 • 300

upvoted an article 9 months ago

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Article • Jul 29, 2024 • 364

upvoted 2 papers 10 months ago

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published Feb 8 • 24

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 249

upvoted a paper about 1 year ago

Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation

Paper • 2411.00412 • Published Nov 1, 2024 • 10

upvoted 4 papers over 1 year ago

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 53
