This is the best-performing checkpoint from the diverse_think training configuration.
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-Coder-1.5B-Instruct |
| Training Dataset | Naholav/CodeGen-Diverse-5K |
| Training Method | LoRA (Low-Rank Adaptation) |
| Checkpoint | step-800, epoch-3 |
| Pass@1 (AtCoder Easy) | 29.27% (12/41 problems) |
The model is prompted to reason inside `<think>` tags before writing code.

Note: All four fine-tuned models were trained with identical hyperparameters for a fair comparison. Better configurations may be found through hyperparameter search (e.g., grid search or random search).
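As a rough illustration of the two search strategies mentioned in the note, the sketch below enumerates candidate configurations. The hyperparameter names and values (`lora_rank`, `learning_rate`, `lora_dropout`) are hypothetical placeholders, not the settings used to train this model, and each candidate would still need its own training and evaluation run.

```python
import itertools
import random

# Hypothetical search space: names and values are illustrative only and are
# NOT the hyperparameters used for this checkpoint.
search_space = {
    "lora_rank": [8, 16, 32],
    "learning_rate": [1e-4, 2e-4, 5e-4],
    "lora_dropout": [0.0, 0.05],
}

def grid_search(space):
    """Return every combination of values in the search space."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in itertools.product(*space.values())]

def random_search(space, num_trials, seed=0):
    """Return num_trials configurations sampled independently per hyperparameter."""
    rng = random.Random(seed)
    return [{key: rng.choice(values) for key, values in space.items()}
            for _ in range(num_trials)]

grid = grid_search(search_space)
print(len(grid))  # 3 * 3 * 2 = 18 configurations

for config in random_search(search_space, num_trials=3):
    print(config)
```

Grid search covers the space exhaustively, while random search caps the number of training runs at `num_trials`, which matters when each run is a full fine-tune.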
Evaluated on LiveCodeBench AtCoder Easy problems (41 questions):
| Model | Pass@1 | Relative Improvement |
|---|---|---|
| Base Model (Qwen2.5-Coder-1.5B-Instruct) | 24.39% | - |
| deep-instruction | 26.83% | +10% |
| diverse-think (this model) | 29.27% | +20% |
| deep-think | 31.71% | +30% |
| diverse-instruction | 31.71% | +30% |
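The improvement figures in the table appear to be relative to the base model's Pass@1 rather than percentage-point differences. The sketch below reproduces the numbers under that reading. Only the 12/41 count is stated in this card; the other solved counts are inferred from the reported percentages and should be treated as assumptions.

```python
TOTAL_PROBLEMS = 41

# Solved-problem counts. Only 12/41 (diverse-think) is stated in the card;
# the remaining counts are inferred from the reported Pass@1 percentages.
solved = {
    "Base Model": 10,
    "deep-instruction": 11,
    "diverse-think": 12,
    "deep-think": 13,
    "diverse-instruction": 13,
}

def pass_at_1(num_solved, total=TOTAL_PROBLEMS):
    """Pass@1 as a percentage: the share of problems solved on the first attempt."""
    return 100.0 * num_solved / total

def relative_improvement(score, baseline):
    """Improvement over the baseline, as a percentage of the baseline score."""
    return 100.0 * (score - baseline) / baseline

baseline = pass_at_1(solved["Base Model"])
for name, count in solved.items():
    score = pass_at_1(count)
    gain = relative_improvement(score, baseline)
    print(f"{name}: {score:.2f}% ({gain:+.0f}% vs. base)")
```

With only 41 problems, each additional solved problem moves Pass@1 by about 2.44 percentage points, so the differences between models correspond to one to three problems.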
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-1.5B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Naholav/diverse-think")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct")

# Generate with think prompt
messages = [
    {"role": "system", "content": "You are an expert programmer. Use <think> tags for reasoning before writing code."},
    {"role": "user", "content": "Your problem here..."}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
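To separate the reasoning from the final solution, a small post-processing helper can be useful. The sketch below assumes the completion has the form `<think>...</think>` followed by a fenced code block; that output format is an assumption, and `split_reasoning` and `extract_code` are hypothetical helper names, not part of any library.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to avoid nesting fences

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
CODE_RE = re.compile(FENCE + r"(?:\w+)?\n(.*?)" + FENCE, re.DOTALL)

def split_reasoning(text):
    """Return (reasoning, answer); reasoning is empty if no <think> block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

def extract_code(answer):
    """Return the first fenced code block, or the whole answer if none is fenced."""
    match = CODE_RE.search(answer)
    return match.group(1).strip() if match else answer.strip()

# Made-up completion used only to demonstrate the helpers.
sample = ("<think>\nSum the list.\n</think>\n"
          + FENCE + "python\nprint(sum([1, 2]))\n" + FENCE)
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(extract_code(answer))
```

Apply these helpers to the newly generated tokens only, for example by decoding `outputs[0][inputs["input_ids"].shape[1]:]`. The full decoded sequence also contains the system prompt, which itself mentions `<think>` and would confuse the pattern match.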
If you use this model, please cite:
```bibtex
@misc{naholav2024codegen,
  author = {naholav},
  title = {CodeGen: LoRA Fine-tuning for Competitive Programming},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Naholav/diverse-think}
}
```