dmitry315 committed on
Commit a439b13 · verified · 1 Parent(s): 8e78968

End of training

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -2,7 +2,7 @@
  base_model: HuggingFaceTB/SmolLM-135M-Instruct
  datasets: HumanLLMs/Human-Like-DPO-Dataset
  library_name: transformers
- model_name: trainer_output
+ model_name: dmitry315/llm-course-hw2-reward-model
  tags:
  - generated_from_trainer
  - trl
@@ -10,7 +10,7 @@ tags:
  licence: license
  ---
 
- # Model Card for trainer_output
+ # Model Card for dmitry315/llm-course-hw2-reward-model
 
  This model is a fine-tuned version of [HuggingFaceTB/SmolLM-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct) on the [HumanLLMs/Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset) dataset.
  It has been trained using [TRL](https://github.com/huggingface/trl).