SpiceeChat

FirstName Gender Classifier (30M)

Lightweight, fast, and accurate, because guessing isn't a strategy.



Overview

This model is a fine-tuned version of a custom 20M-parameter CausalLM architecture, originally built by PhysiQuanty. It was trained on a combination of:

  • 150,000 samples from the SpiceeChat/Genre-Classifier dataset
  • 922 hand-curated examples to improve coverage and diversity

The result is a compact, production-ready classifier that predicts gender from a first name with about 85% validation accuracy (84.74%) and no unnecessary overhead.


Quick Start

from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True   # required: custom architecture, modeling code ships in the repo
)
tokenizer = AutoTokenizer.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True
)

name = "Arjun"
inputs = tokenizer(name, return_tensors="pt")
pred, probs = model.predict_gender(inputs.input_ids)
gender = "M" if pred.item() == 1 else "F"
print(f"{name} → {gender} (confidence: {probs.max().item():.2f})")

Expected output:

Arjun → M (confidence: 0.98)
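To classify several names, the Quick Start steps can be wrapped in a small helper. This is a minimal sketch, not part of the repository: `classify_names` is a hypothetical name, and it assumes only the `predict_gender` call and the `1 → M` mapping shown above. It processes one name per call, so it does not rely on the tokenizer having a padding token.

```python
def classify_names(model, tokenizer, names):
    """Return a (name, label, confidence) tuple for each first name.

    Hypothetical helper: it reuses the calls from Quick Start and
    assumes nothing else about the model's API.
    """
    results = []
    for name in names:
        # One name per call, so no padding or attention mask is needed.
        inputs = tokenizer(name, return_tensors="pt")
        pred, probs = model.predict_gender(inputs.input_ids)
        label = "M" if pred.item() == 1 else "F"  # same mapping as Quick Start
        results.append((name, label, probs.max().item()))
    return results
```

With `model` and `tokenizer` loaded as in Quick Start, `classify_names(model, tokenizer, ["Arjun", "Wei", "Ama"])` would return one tuple per name.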

Performance

| Metric | Value |
|---|---|
| Validation Accuracy | 84.74% |
| Macro F1 | 81.06% |
| Parameters | ~20M |
| Model Size | 129 MB |
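Macro F1 can sit below accuracy when one class is predicted less reliably than the other, because it averages the per-class F1 scores without weighting by class frequency. The sketch below shows the arithmetic on invented toy labels (not the model's validation data). The function names are hypothetical, and which class plays the minority here is arbitrary.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy_and_macro_f1(y_true, y_pred, labels=(0, 1)):
    """Return (accuracy, macro F1) for two parallel label lists."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        per_class.append(f1_score(tp, fp, fn))
    # Macro average: every class counts equally, however rare it is.
    return accuracy, sum(per_class) / len(per_class)


# Toy data: six samples of class 0, two of class 1, one class-1 sample missed.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1]
accuracy, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
```

In this toy case accuracy is 7/8, while the per-class F1 scores are 12/13 and 2/3, so the macro average lands lower than the accuracy.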

Trained for 3 epochs with class weighting (F : M = 3:1) to handle the natural imbalance in the training data. Loss dropped cleanly from 0.41 to 0.34 across training, with stable convergence and no sign of overfitting.
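As an illustration of class weighting, here is a minimal pure-Python sketch of a weighted cross-entropy. It assumes the 3:1 ratio denotes per-class loss weights and that label 0 is F and label 1 is M (following the Quick Start mapping). The actual training code is not shown in this card, so the function and constant names are hypothetical. The normalization by the sum of weights mirrors the default mean reduction of `torch.nn.CrossEntropyLoss` when a `weight` tensor is passed.

```python
import math

# Assumed mapping: 0 = F (weight 3), 1 = M (weight 1).
CLASS_WEIGHTS = {0: 3.0, 1: 1.0}


def weighted_cross_entropy(probs, targets, weights=CLASS_WEIGHTS):
    """Weighted mean of -log p(true class).

    probs:   list of [p_F, p_M] rows, each summing to 1
    targets: list of true class ids (0 or 1)
    """
    total = 0.0
    weight_sum = 0.0
    for row, target in zip(probs, targets):
        w = weights[target]
        total += -w * math.log(row[target])
        weight_sum += w
    # Divide by the summed weights so the loss scale stays comparable.
    return total / weight_sum


# Toy batch: one F sample and one M sample, same predicted distribution.
probs = [[0.7, 0.3], [0.7, 0.3]]
targets = [0, 1]
loss = weighted_cross_entropy(probs, targets)
```

Under these assumptions, a mistake on an F sample contributes three times as much to the loss as the same mistake on an M sample.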


What Makes This Model Different

  • Handles global names: from Wei (Chinese) to Haruto (Japanese) to Ama (Ghanaian)
  • Generalizes beyond dictionaries: learns naming patterns rather than relying on lookup tables
  • Custom lightweight architecture: small enough to run comfortably on CPU
  • Works with Hugging Face Transformers: loads through AutoModel and AutoTokenizer with trust_remote_code=True

Training Details

| Detail | Value |
|---|---|
| Base model | SpiceeChat/Genre-Classifier-1-20M-BASE-BF16 |
| Training data | 150,000 + 922 custom examples |
| Optimizer | AdamW (LR = 2e-5) |
| Batch size | 64 (train) / 256 (eval) |
| Hardware | Tesla T4 (FP16) |

Notes

  • The model uses weight tying between head.weight and tok_emb.weight. A harmless head.weight | MISSING warning may appear on load; this is expected behavior.
  • trust_remote_code=True is required because the architecture is custom. The modeling code is included in this repository and fully auditable.
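For readers unfamiliar with weight tying, the sketch below shows the general pattern in PyTorch. `TinyTiedLM` is a hypothetical toy module, not the repository's architecture. Only the attribute names `tok_emb` and `head` are taken from the note above, and the sizes are made up.

```python
import torch
from torch import nn


class TinyTiedLM(nn.Module):
    """Hypothetical toy model illustrating tied input/output weights."""

    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, dim)          # weight: (vocab, dim)
        self.head = nn.Linear(dim, vocab_size, bias=False)    # weight: (vocab, dim)
        # Tie the two: both modules now hold the same Parameter object.
        self.head.weight = self.tok_emb.weight

    def forward(self, input_ids):
        return self.head(self.tok_emb(input_ids))


model = TinyTiedLM()
logits = model(torch.tensor([[1, 2, 3]]))

# parameters() yields the shared tensor once, so it is counted once.
n_params = sum(p.numel() for p in model.parameters())
```

Because the two names refer to one tensor, a checkpoint may store it under only one of them. That is a plausible reason for a loader to report `head.weight` as missing before the tie is restored.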

Try It Yourself

python -c "
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
name = input('Enter a first name: ')
inputs = tokenizer(name, return_tensors='pt')
pred, _ = model.predict_gender(inputs.input_ids)
print('M' if pred.item() == 1 else 'F')
"

License

Released under the Apache 2.0 license. Use it, modify it, ship it, subject to the license's attribution and notice terms.


Built with a lot of caffeine ☕ by SpiceeChat

Built by PhysiQuanty (who did most of the work) and QuantaSparkLabs.

Safetensors: model size 32.3M params, tensor type F32