AI & ML interests
None yet
Organizations
shirwu/official-hotpotqa-hotpotqa_four_agents_pipeline-hint_generator-iter0
Updated
shirwu/official-hotpotqa-hotpotqa_four_agents_pipeline-answer_generator-iter0
Updated
shirwu/trainsize200_iter3_rerun-hotpotqa-hotpotqa_two_agents_pipeline-answer_generator-iter0
Updated
shirwu/trainsize200_iter3-hotpotqa-hotpotqa_two_agents_pipeline-answer_generator-iter0
Updated
Text Classification • 8B • Updated • 8
shirwu/iter_debug-hotpotqa-hotpotqa_two_agents_pipeline-answer_generator-iter0
Updated
shirwu/hotpotqa_two_agents_pipeline-hint_generator-iter2
Updated
shirwu/hotpotqa_two_agents_pipeline-answer_generator-iter2
Updated
shirwu/hotpotqa_two_agents_pipeline-hint_generator-iter1
Updated
shirwu/hotpotqa_two_agents_pipeline-answer_generator-iter1
Updated
shirwu/hotpotqa_two_agents_pipeline-hint_generator-iter0
Updated
shirwu/hotpotqa_two_agents_pipeline-answer_generator-iter0
Updated
shirwu/reward_model_train_final
Updated
shirwu/reward_model_train_debug
1B • Updated • 7
Text Classification • 8B • Updated • 3
Text Classification • 1B • Updated • 3
Text Classification • 1B • Updated • 16
shirwu/rm_final_Llama-3.1-1B-Instruct
Text Classification • 1B • Updated • 5
shirwu/rm_final_Llama-3.1-8B-Instruct
Text Classification • 8B • Updated • 7
Text Classification • 8B • Updated • 6
shirwu/rm_Llama-3.1-8B-Instruct
Updated
shirwu/rm__freezelast_oldtemplate__quant__Llama-3.1-8B-Instruct
Updated
shirwu/rm__lr-2e-6__freezelast_oldtemplate__quant__Llama-3.1-8B-Instruct
Updated
shirwu/rm__lr-1e-5__freezelast_oldtemplate__quant__Llama-3.1-8B-Instruct
Updated
shirwu/rm_freezelast_oldtemplate_quant_Llama-3.1-8B-Instruct
Updated
shirwu/rm_freeze-last_quant_Llama-3.1-8B-Instruct
Updated
shirwu/rm_freeze-last_quant_Skywork-Reward-Llama-3.1-8B-v0.2
Updated
shirwu/rm_unfreeze_last_Llama-3.1-8B-Instruct
Updated