sxcasf (ads)

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published 4 days ago • 10

commented on a paper 1 day ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published 4 days ago • 10

upvoted a paper 16 days ago

A Survey of On-Policy Distillation for Large Language Models

Paper • 2604.00626 • Published 17 days ago • 9

upvoted a paper 25 days ago

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published 27 days ago • 77

liked a dataset about 2 months ago

AudioVisual-Caption/ASID-1M

Viewer • Updated Mar 11 • 241k • 4.34k • 82

liked 2 models 3 months ago

Alibaba-Apsara/DASD-30B-A3B-Thinking-Preview

Text Generation • Updated Jan 15 • 127 • 52

Alibaba-Apsara/DASD-4B-Thinking

Text Generation • Updated Jan 15 • 429 • 217

upvoted a collection 3 months ago

DASD-Thinking

Collection

6 items • Updated Feb 3 • 25

liked a dataset 3 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b

Viewer • Updated Jan 31 • 306k • 2.54k • 320

upvoted a paper 3 months ago

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 63

liked a dataset 3 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b-Logprob

Viewer • Updated Jan 15 • 435k • 786 • 58

updated 2 datasets 3 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b

Viewer • Updated Jan 31 • 306k • 2.54k • 320

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b-Logprob

Viewer • Updated Jan 15 • 435k • 786 • 58

updated 2 models 3 months ago

Alibaba-Apsara/DASD-30B-A3B-Thinking-Preview

Text Generation • Updated Jan 15 • 127 • 52

Alibaba-Apsara/DASD-4B-Thinking

Text Generation • Updated Jan 15 • 429 • 217

upvoted a paper 4 months ago

Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation

Paper • 2512.20908 • Published Dec 24, 2025 • 29

New activity in HuggingFaceTB/Countdown-Task-GOLD 4 months ago

Inconsistent numbers

#1 opened 5 months ago by MysticJay

upvoted a paper 4 months ago

Unified Video Editing with Temporal Reasoner

Paper • 2512.07469 • Published Dec 8, 2025 • 46

New activity in Qwen/Qwen3-1.7B 5 months ago

When enable_thinking=True, why doesn't the chat_template output end with "<think>"?

#16 opened 5 months ago by sxcasf

New activity in VityaVitalich/Qwen3-1.7B 5 months ago

When enable_thinking=True, why doesn't the chat_template output end with "<think>"?

#1 opened 5 months ago by sxcasf
