Diverse Think - LoRA Fine-tuned Qwen2.5-Coder-1.5B

This is the best-performing checkpoint from the diverse_think training configuration.

Model Details

Property	Value
Base Model	Qwen/Qwen2.5-Coder-1.5B-Instruct
Training Dataset	Naholav/CodeGen-Diverse-5K
Training Method	LoRA (Low-Rank Adaptation)
Checkpoint	step-800, epoch-3
Pass@1 (AtCoder Easy)	29.27% (12/41 problems)

Training Configuration

Prompt Style: Think (uses <think> tags for reasoning)
System Prompt: "You are an expert programmer. Use <think> tags for reasoning before writing code."
LoRA Rank: 32
LoRA Alpha: 64
LoRA Dropout: 0.05
Learning Rate: 5e-5

Note: All four models in the comparison below were trained with identical hyperparameters so that the results are directly comparable. These settings were not tuned; a hyperparameter search (e.g., grid search or random search) may find better configurations.
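As a rough illustration of what the rank and alpha values imply, the sketch below computes the usual LoRA scaling factor (alpha divided by rank) and the number of trainable parameters the adapter adds to a single linear layer. The 1536 x 1536 layer shape is a hypothetical example chosen for illustration, and the helper name `lora_param_count` is not part of any library; the actual target modules for this adapter are not stated here.

```python
# Hyperparameters copied from the training configuration above.
lora_rank = 32
lora_alpha = 64
lora_dropout = 0.05
learning_rate = 5e-5

# Standard LoRA scales the low-rank update (B @ A) by alpha / rank.
scaling = lora_alpha / lora_rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) linear layer."""
    # A has shape (rank, d_in); B has shape (d_out, rank).
    return rank * d_in + d_out * rank


# Hypothetical layer shape, used only to make the arithmetic concrete.
example_params = lora_param_count(1536, 1536, lora_rank)

print(f"scaling factor: {scaling}")
print(f"adapter parameters for one 1536x1536 layer: {example_params:,}")
```

With rank 32 and alpha 64, the update is scaled by 2.0, which is a common pairing (alpha set to twice the rank).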

All Models Performance Comparison

Evaluated on LiveCodeBench AtCoder Easy problems (41 questions):

Model	Pass@1	Improvement (relative to base)
Base Model (Qwen2.5-Coder-1.5B-Instruct)	24.39%	-
deep-instruction	26.83%	+10%
diverse-think (this model)	29.27%	+20%
deep-think	31.71%	+30%
diverse-instruction	31.71%	+30%
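The percentages in the table can be reproduced from solved-problem counts out of 41. Only the 12/41 figure for this model is stated explicitly; the other counts in the sketch below are inferred from the reported percentages and should be read as assumptions. The sketch also shows that the improvement column is relative to the base model's score, not a difference in percentage points.

```python
TOTAL_PROBLEMS = 41

# Solved counts: 12 is stated in the model card; the rest are inferred
# from the reported Pass@1 percentages (assumption).
solved = {
    "base": 10,
    "deep-instruction": 11,
    "diverse-think": 12,
    "deep-think": 13,
    "diverse-instruction": 13,
}


def pass_at_1(n_solved: int, n_total: int) -> float:
    """Pass@1 as a percentage, with one attempt per problem."""
    return 100.0 * n_solved / n_total


def relative_improvement(score: float, baseline: float) -> float:
    """Improvement over the baseline, as a percentage of the baseline."""
    return 100.0 * (score - baseline) / baseline


baseline = pass_at_1(solved["base"], TOTAL_PROBLEMS)
for name, n in solved.items():
    score = pass_at_1(n, TOTAL_PROBLEMS)
    rel = relative_improvement(score, baseline)
    print(f"{name:22s} {score:6.2f}%  relative: {rel:+.0f}%  absolute: {score - baseline:+.2f} pts")
```

For this model, the +20% relative improvement corresponds to two additional problems solved (12 versus 10), or about 4.88 percentage points. With only 41 problems, each problem is worth roughly 2.4 points, so small differences between models should be interpreted with caution.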

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-1.5B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Naholav/diverse-think")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct")

# Generate with think prompt
messages = [
    {"role": "system", "content": "You are an expert programmer. Use <think> tags for reasoning before writing code."},
    {"role": "user", "content": "Your problem here..."}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
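Because the model is prompted to reason inside <think> tags before writing code, it can be handy to separate the reasoning from the final answer. The following is a minimal sketch, assuming the response contains at most one `<think>...</think>` block followed by the code; the helper name `split_think` and the sample string are hypothetical and not part of the model's tooling.

```python
import re


def split_think(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a response that may contain <think> tags."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block found: treat the whole response as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Hypothetical response text, for illustration only.
sample = "<think>Read the numbers, then sum them.</think>\nprint(sum(map(int, input().split())))"
reasoning, answer = split_think(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Note that the decoded output in the usage example above includes the prompt, and the system prompt itself contains the literal text `<think>`. To avoid matching that, it is probably safer to apply the helper to the newly generated tokens only, for example by decoding `outputs[0][inputs["input_ids"].shape[1]:]` instead of `outputs[0]`.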

Resources

GitHub Repository: https://github.com/naholav/CodeGen
Training Dataset: Naholav/CodeGen-Diverse-5K

Citation

If you use this model, please cite:

@misc{naholav2024codegen,
  author = {naholav},
  title = {CodeGen: LoRA Fine-tuning for Competitive Programming},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Naholav/diverse-think}
}
