AI & ML interests
Reinforcement Learning
Organizations
luckeciano/pku-llama3.1-8b-dataset-test-generations • 4.7M • 10
luckeciano/pku-llama3.1-8b-dataset-train-generations • 1.36M
luckeciano/pku-alpaca3.1-8b-eval-gt-rewards • 4.7k • 6
luckeciano/pku-alpaca3.1-8b-gt-rewards • 6.05M • 1
luckeciano/pku-llama3.1-8b-answers-features-test • 4.42M • 12
luckeciano/pku-llama3.1-8b-answers-features-train • 1.28M • 31
luckeciano/pku-llama3.1-8b-dataset-features-gt-reward-modeling
luckeciano/pku-llama3.1-8b-dataset-features • 18.3k • 49
luckeciano/PKU-SafeRLHF-Shifts • 18.3k • 5
luckeciano/mistral8x22b-reddit-post-features • 92.9k • 103
luckeciano/llama370b-reddit-post-features • 82.5k • 12
luckeciano/llama370b-features-reddit • 150k • 53
luckeciano/mistral8x22b-features-reddit • 166k • 46
luckeciano/hermes-reddit-post-features • 92.7k • 32
luckeciano/llama27b-features-reddit • 189k • 30
luckeciano/falcon7b-features-reddit • 159k • 17
luckeciano/hermes-features-ultrafeedback • 63.8k • 49
luckeciano/reddit-features-hermes • 169k • 36
luckeciano/learning-to-summarize • 426k • 84 • 1