Jiaxin Huang

teapot123

AI & ML interests

None yet

Recent Activity

upvoted a paper 11 days ago

You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories

upvoted a paper 12 days ago

Process Rewards with Learned Reliability

authored a paper 18 days ago

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

Organizations

upvoted a paper 11 days ago

You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories

Paper • 2605.21468 • Published 12 days ago • 49

upvoted a paper 12 days ago

Process Rewards with Learned Reliability

Paper • 2605.15529 • Published 17 days ago • 53

upvoted a paper 20 days ago

G-Zero: Self-Play for Open-Ended Generation from Zero Data

Paper • 2605.09959 • Published 21 days ago • 17

upvoted a paper 24 days ago

Nonsense Helps: Prompt Space Perturbation Broadens Reasoning Exploration

Paper • 2605.05566 • Published 25 days ago • 37

upvoted 3 papers 4 months ago

upvoted a paper 5 months ago

RelayLLM: Efficient Reasoning via Collaborative Decoding

Paper • 2601.05167 • Published Jan 8 • 31

upvoted a paper 6 months ago

VisPlay: Self-Evolving Vision-Language Models from Images

Paper • 2511.15661 • Published Nov 19, 2025 • 45

upvoted a paper 9 months ago

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Paper • 2508.19652 • Published Aug 27, 2025 • 85

upvoted a paper 10 months ago

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7, 2025 • 132

upvoted a paper 12 months ago

POSS: Position Specialist Generates Better Draft for Speculative Decoding

Paper • 2506.03566 • Published Jun 4, 2025 • 6

upvoted 2 papers about 1 year ago

CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation

Paper • 2504.00043 • Published Mar 30, 2025 • 10

Efficient Test-Time Scaling via Self-Calibration

Paper • 2503.00031 • Published Feb 25, 2025 • 15

upvoted 2 papers over 1 year ago

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 23, 2025 • 41

Taming Overconfidence in LLMs: Reward Calibration in RLHF

Paper • 2410.09724 • Published Oct 13, 2024 • 3
