LoRA and full fine-tuning experiments on R1 distills to generate Python code for math problems
Ram
0-hero
AI & ML interests
All work on this profile is personal
Organizations
Models (48)
0-hero/r1-7B-grpo-v3.3-epoch-3 • 8B • Updated • 6
0-hero/r1-7B-grpo-v3.3-epoch-2 • 8B • Updated • 7
0-hero/r1-7B-grpo-v3.3-epoch-1 • 8B • Updated • 6
0-hero/r1-7B-grpo-v3.2-epoch-2 • 8B • Updated • 8
0-hero/r1-7B-grpo-v3.2-epoch-1 • 8B • Updated • 7
0-hero/r1-14B-grpo-v3.1-epoch-2 • 15B • Updated • 9
0-hero/r1-14B-grpo-v3.1-epoch-1 • 15B • Updated • 9
0-hero/r1-7B-grpo-v3.1-epoch-3 • 8B • Updated • 14
0-hero/r1-7B-grpo-v3.1-epoch-2 • 8B • Updated • 8
0-hero/r1-7B-grpo-v2-temp-1.0-60 • 8B • Updated • 10
Datasets (14)
0-hero/MATH • Viewer • Updated • 331k • 33 • 1
0-hero/audio-samples-fixed • Viewer • Updated • 10 • 414
0-hero/distilabel-math-preference-dpo • Viewer • Updated • 2.42k • 46
0-hero/lj_speech_with_spectogram_conversations • Viewer • Updated • 13.1k • 28 • 1
0-hero/lj_speech_with_spectogram • Viewer • Updated • 13.1k • 51 • 1
0-hero/Matter-0.2-alpha • Viewer • Updated • 2.52M • 45 • 3
0-hero/Matter-0.1 • Viewer • Updated • 2.25M • 146 • 53
0-hero/Matter-0.1-Slim-D • Viewer • Updated • 1.32M • 63
0-hero/Matter-0.1-Slim-C • Viewer • Updated • 343k • 41
0-hero/Matter-0.1-Slim-B • Viewer • Updated • 308k • 70 • 1