Shawn

csfufu

AI & ML interests

None yet

Recent Activity

upvoted a paper 20 days ago

WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7 • 141

upvoted a paper 24 days ago

OneThinker: All-in-one Reasoning Model for Image and Video

Paper • 2512.03043 • Published 26 days ago • 32

authored a paper 3 months ago

ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping

Paper • 2510.08457 • Published Oct 9 • 12

upvoted a paper 3 months ago

ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping

Paper • 2510.08457 • Published Oct 9 • 12

upvoted a collection 3 months ago

ARES

🌴ARES is an open-source framework for adaptive multimodal reasoning, using difficulty-aware training and entropy-shaped policy optimization. • 5 items • Updated Oct 12 • 2

upvoted 3 papers 4 months ago

Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning

Paper • 2509.03646 • Published Sep 3 • 32

Reconstruction Alignment Improves Unified Multimodal Models

Paper • 2509.07295 • Published Sep 8 • 40

Interleaving Reasoning for Better Text-to-Image Generation

Paper • 2509.06945 • Published Sep 8 • 14

liked a model 5 months ago

rednote-hilab/dots.vlm1.inst

Image-Text-to-Text • 672B • Updated Aug 21 • 2.61k • 80

liked a model 6 months ago

csfufu/Revisual-R1-final

Image-Text-to-Text • 8B • Updated Jul 14 • 28 • 8

updated 2 datasets 6 months ago

csfufu/Grammer_dataset

Viewer • Updated Jul 14 • 47.4k • 78 • 2

csfufu/textrl

Viewer • Updated Jul 14 • 32.5k • 78 • 1

updated a model 6 months ago

csfufu/Revisual-R1-final

Image-Text-to-Text • 8B • Updated Jul 14 • 28 • 8

updated a collection 6 months ago

Revisual-R1

🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement. • 6 items • Updated Oct 9 • 3

published a dataset 6 months ago

csfufu/Grammer_dataset

Viewer • Updated Jul 14 • 47.4k • 78 • 2

updated a collection 6 months ago

Revisual-R1

🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement. • 6 items • Updated Oct 9 • 3

updated a dataset 6 months ago

csfufu/mmrl

Viewer • Updated Jun 25 • 30.9k • 55 • 1

updated a model 6 months ago

csfufu/Revisual-R1-Coldstart

Image-Text-to-Text • 8B • Updated Jun 25 • 118 • 5

updated a collection 6 months ago

Revisual-R1

🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement. • 6 items • Updated Oct 9 • 3

updated a collection 7 months ago

Revisual-R1

🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement. • 6 items • Updated Oct 9 • 3