NormCQGen Model - Norwegian MCQ Generator
This is a LoRA adapter for openai/gpt-oss-20b that generates Norwegian multiple-choice questions (MCQs). It was trained with supervised fine-tuning (SFT) using the TRL library on the normcqgen-thinking dataset.
Model Details
Model Description
This LoRA adapter specializes in generating high-quality Norwegian multiple-choice questions suitable for educational assessments and quiz creation. The adapter applies selective fine-tuning to attention projection layers and Mixture-of-Experts (MoE) layers of the base model.
- Developed by: solarmar
- Model type: LoRA adapter for causal language modeling
- Language: Norwegian (Bokmål)
- License: Apache 2.0 (same as base model)
- Fine-tuned from: openai/gpt-oss-20b
- Adapter size: 143 MB
Model Architecture
LoRA Configuration:
- Rank (r): 16
- Alpha: 16
- Dropout: 0.05 (training), 0.0 (inference)
- Target Modules (attention projections): q_proj, k_proj, v_proj, o_proj
- Target Parameters (MoE expert layers):
  - Layer 15: mlp.experts.gate_up_proj, mlp.experts.down_proj
  - Layer 19: mlp.experts.gate_up_proj, mlp.experts.down_proj
  - Layer 21: mlp.experts.gate_up_proj, mlp.experts.down_proj
  - Layer 23: mlp.experts.gate_up_proj, mlp.experts.down_proj
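For reference, the configuration above corresponds roughly to the PEFT LoraConfig sketched below. This is a sketch, not the exact training configuration: the target_parameters path format and the task_type are assumptions; the authoritative values are in the adapter's adapter_config.json.

from peft import LoraConfig

# MoE expert weights targeted as nn.Parameters (requires a PEFT version with
# target_parameters support). The path format here is an assumption.
moe_layers = (15, 19, 21, 23)
target_parameters = []
for layer in moe_layers:
    target_parameters += [
        f"{layer}.mlp.experts.gate_up_proj",
        f"{layer}.mlp.experts.down_proj",
    ]

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,  # 0.05 during training; set to 0.0 for inference
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    target_parameters=target_parameters,
    task_type="CAUSAL_LM",
)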
Intended Uses
Direct Use
This model is designed to generate Norwegian multiple-choice questions for:
- Educational assessments
- Quiz creation
- Learning materials development
- Knowledge evaluation tasks
Out-of-Scope Use
- Generating content in languages other than Norwegian
- Non-MCQ text generation tasks
- Critical decision-making without human review
- Tasks requiring up-to-date factual information (model knowledge cutoff applies)
How to Use
Using PEFT (Recommended)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
"openai/gpt-oss-20b",
device_map="auto",
torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "solarmar/normcqgen-model")
# Prepare input; add_generation_prompt=True appends the assistant turn for the model to complete
messages = [{"role": "user", "content": "Generer et flervalgsspørsmål om fotosyntese."}]  # "Generate a multiple-choice question about photosynthesis."
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# Generate
outputs = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
Using AutoPeftModel
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM
# Load model with adapter
model = AutoPeftModelForCausalLM.from_pretrained(
"solarmar/normcqgen-model",
device_map="auto",
torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("solarmar/normcqgen-model")
# Generate
messages = [{"role": "user", "content": "Lag et spørsmål om norsk historie."}]  # "Create a question about Norwegian history."
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Training Details
Training Data
The model was fine-tuned on solarmar/normcqgen-thinking, a dataset of Norwegian multiple-choice questions with reasoning traces.
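To inspect the training data locally, the dataset can be loaded with the datasets library. This is a minimal sketch and does not assume any particular split or column layout; see the dataset card for details.

from datasets import load_dataset

# Load the MCQ dataset with reasoning traces used for fine-tuning
dataset = load_dataset("solarmar/normcqgen-thinking")
print(dataset)  # shows the available splits and column names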
Training Procedure
Training Hyperparameters:
- Training method: Supervised Fine-Tuning (SFT)
- Number of epochs: 2
- Per-device train batch size: 32
- Per-device eval batch size: 32
- Gradient accumulation steps: 4
- Effective batch size: 128 (32 × 4)
- Learning rate: 2e-4
- Weight decay: 0.01
- Warmup ratio: 0.03
- Max sequence length: 1024 tokens
- Logging steps: 5
- Evaluation steps: 10
- Save steps: 10
LoRA Hyperparameters:
- LoRA rank (r): 16
- LoRA alpha: 16
- LoRA dropout: 0.05
- Target modules: All attention projections (q, k, v, o)
- Target parameters: MoE expert layers at positions 15, 19, 21, 23
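The hyperparameters above map roughly onto the TRL setup sketched below. This is not the actual training script: output_dir and the split names are hypothetical, the sequence-length argument has been renamed across TRL versions, and the sketch reuses the dataset and lora_config objects from the earlier sketches.

from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    output_dir="normcqgen-model",    # hypothetical output directory
    num_train_epochs=2,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,   # 32 x 4 = effective batch size 128
    learning_rate=2e-4,
    weight_decay=0.01,
    warmup_ratio=0.03,
    max_length=1024,                 # sequence-length cap (called max_seq_length in older TRL)
    logging_steps=5,
    eval_strategy="steps",
    eval_steps=10,
    save_steps=10,
)

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",      # SFTTrainer also accepts an already-loaded model object
    args=training_args,
    train_dataset=dataset["train"],  # assumed split names; see the dataset card
    eval_dataset=dataset["test"],
    peft_config=lora_config,         # the LoraConfig sketched under Model Architecture
)
trainer.train()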
Training Results
| Metric | Value |
|---|---|
| Evaluation Loss | 0.736 |
| Mean Token Accuracy | 83.17% |
| Evaluation Entropy | 0.748 |
| Total Tokens Evaluated | 5,101,372 |
| Training Epochs | 2.0 |
The model reached 83.17% token-level accuracy on the evaluation set while maintaining reasonable diversity in its predictions (entropy 0.748).
Framework Versions
- PEFT: 0.18.0
- TRL: 0.25.1
- Transformers: 4.57.1
- PyTorch: 2.5.1+cu121
- Datasets: 4.4.1
- Tokenizers: 0.22.1
Limitations and Bias
- The model is specialized for the Norwegian language (Bokmål) and may not perform well in other languages
- Performance depends on the quality and coverage of the training dataset
- May reflect biases present in the training data
- Generated questions should be reviewed by educators before use in formal assessments
- The model's knowledge is limited to the base model's training data cutoff
Citation
If you use this model, please cite:
@misc{normcqgen2024,
author = {solarmar},
title = {NormCQGen: Norwegian MCQ Generation Model},
year = {2024},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/solarmar/normcqgen-model}}
}
TRL Citation
This model was trained using TRL (Transformer Reinforcement Learning):
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}
Acknowledgments
- Base model: openai/gpt-oss-20b
- Training framework: TRL
- PEFT library: Hugging Face PEFT