SpiceeChat

FirstName Gender Classifier — 30M

Lightweight, fast, and accurate — because guessing isn't a strategy.

Overview

This model is a fine-tuned version of a custom 20M-parameter CausalLM architecture, originally built by PhysiQuanty. It was trained on a combination of:

150,000 samples from the SpiceeChat/Genre-Classifier dataset
922 hand-curated examples to improve coverage and diversity

The result is a compact classifier that predicts gender (M or F) from a first name with 84.74% validation accuracy and 81.06% macro F1.

Quick Start

from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True   # required: the repository ships a custom architecture
)
tokenizer = AutoTokenizer.from_pretrained(
    "SpiceeChat/FirstName-Genre-Classifier-30M-SFT",
    trust_remote_code=True
)

name = "Arjun"
inputs = tokenizer(name, return_tensors="pt")
pred, probs = model.predict_gender(inputs.input_ids)
gender = "M" if pred.item() == 1 else "F"
print(f"{name} → {gender} (confidence: {probs.max().item():.2f})")

Expected output:

Arjun → M (confidence: 0.98)
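
For downstream use it can help to abstain on low-confidence predictions rather than force an M/F label. The sketch below assumes the label mapping from the snippet above (1 means M, anything else means F). The `label_with_threshold` helper and the 0.80 cutoff are hypothetical and not part of the model's API.

```python
def label_with_threshold(pred: int, confidence: float, threshold: float = 0.80) -> str:
    """Map a class index and its probability to "M", "F", or "unknown".

    Assumes the mapping used in the Quick Start snippet: 1 -> "M", otherwise "F".
    The default threshold is an illustrative choice, not a tuned value.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a probability in [0, 1]")
    if confidence < threshold:
        return "unknown"
    return "M" if pred == 1 else "F"


# Typical call with the model's outputs (not run here):
#   label_with_threshold(pred.item(), probs.max().item())
print(label_with_threshold(1, 0.98))  # M
print(label_with_threshold(0, 0.55))  # unknown
```

A suitable cutoff depends on how costly a wrong label is for the application; it would need to be chosen on held-out data.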

Performance

Metric	Value
Validation Accuracy	84.74%
Macro F1	81.06%
Parameters	~20M (base architecture); 32.3M in the published Safetensors checkpoint
Model Size	129 MB
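
Macro F1 averages the per-class F1 scores with equal weight, so it can sit below accuracy when one class is rarer or harder to predict. The sketch below shows the calculation on toy labels invented for illustration; the `macro_f1` function is a hypothetical helper, not the evaluation script used for the figures above.

```python
def macro_f1(y_true, y_pred, labels=("F", "M")):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, imbalanced example: six "M" names and two "F" names, one "F" misclassified.
y_true = ["M"] * 6 + ["F"] * 2
y_pred = ["M"] * 6 + ["M", "F"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")                  # accuracy = 0.875
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")  # macro F1 = 0.795
```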

Trained for 3 epochs with class weighting (F : M = 3:1) to offset the class imbalance in the training data. Training loss fell steadily from 0.41 to 0.34 over the run, with no sign of overfitting.
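
As a rough illustration of class weighting, the sketch below computes a weighted cross-entropy in plain Python. It assumes the 3:1 ratio means errors on F examples count three times as much as errors on M examples, and it normalises by the sum of the target weights (the same convention PyTorch's `CrossEntropyLoss` uses with `weight` and mean reduction). The actual training code is not shown in this card, so treat this as one possible reading.

```python
import math

# Assumed reading of "F : M = 3:1": per-class loss weights.
CLASS_WEIGHTS = {"F": 3.0, "M": 1.0}


def weighted_cross_entropy(probs, targets, weights=CLASS_WEIGHTS):
    """Weighted mean negative log-likelihood.

    probs   : list of dicts mapping label -> predicted probability
    targets : list of true labels
    """
    total = 0.0
    weight_sum = 0.0
    for p, t in zip(probs, targets):
        w = weights[t]
        total += -w * math.log(p[t])
        weight_sum += w
    return total / weight_sum


# Toy batch (invented): an uncertain F example and a confident M example.
probs = [{"F": 0.5, "M": 0.5}, {"F": 0.1, "M": 0.9}]
targets = ["F", "M"]

unweighted = weighted_cross_entropy(probs, targets, {"F": 1.0, "M": 1.0})
weighted = weighted_cross_entropy(probs, targets)
print(f"unweighted: {unweighted:.4f}, weighted: {weighted:.4f}")
```

With these weights the uncertain F example dominates the loss, so the weighted value is higher than the unweighted one.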

What Makes This Model Different

Handles global names — from Wei (Chinese) to Haruto (Japanese) to Ama (Ghanaian)
Generalizes beyond dictionaries — learns naming patterns rather than relying on lookup tables
Custom lightweight architecture — small enough to run comfortably on CPU
Fully compatible with Hugging Face Transformers — loads like any standard model

Training Details

Detail	Value
Base model	`SpiceeChat/Genre-Classifier-1-20M-BASE-BF16`
Training data	150,000 dataset samples + 922 hand-curated examples
Optimizer	AdamW (LR = 2e-5)
Batch size	64 (train) / 256 (eval)
Hardware	Tesla T4 (FP16)
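
For a sense of scale, the number of optimizer steps can be estimated from the figures above. This back-of-the-envelope sketch assumes every sample is used for training (no held-out split taken from the 150,922), no gradient accumulation, and a final partial batch per epoch; none of these are confirmed by the card.

```python
import math

TRAIN_SAMPLES = 150_000 + 922  # dataset samples + hand-curated examples
TRAIN_BATCH_SIZE = 64          # from the table above
EPOCHS = 3                     # from the Performance section

# Assumption: the last, smaller batch of each epoch is kept rather than dropped.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(steps_per_epoch, total_steps)  # 2359 7077
```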

Notes

The model uses weight tying between head.weight and tok_emb.weight. A harmless head.weight | MISSING warning may appear on load — this is expected behavior.
trust_remote_code=True is required because the architecture is custom. The modeling code is included in this repository and fully auditable.
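
To make the weight-tying note concrete, here is a minimal PyTorch sketch. `TinyTiedLM` is a hypothetical toy module that only borrows the attribute names `tok_emb` and `head`; it is not the repository's modeling code. One plausible explanation for the warning, not confirmed by the repository, is that the checkpoint stores a single copy of the shared tensor, so the loader reports `head.weight` as missing before the tie is re-applied.

```python
import torch
import torch.nn as nn


class TinyTiedLM(nn.Module):
    """Toy model (hypothetical) with an output head tied to the token embedding."""

    def __init__(self, vocab_size: int = 100, dim: int = 16):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, dim)        # weight: (vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size, bias=False)  # weight: (vocab_size, dim)
        # Tie the weights: both modules now share one Parameter object.
        self.head.weight = self.tok_emb.weight

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.tok_emb(input_ids))


model = TinyTiedLM()
print(model.head.weight is model.tok_emb.weight)   # True
# Shared parameters are counted once by parameters().
print(sum(p.numel() for p in model.parameters()))  # 1600
```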

Try It Yourself

python -c "
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('SpiceeChat/FirstName-Genre-Classifier-30M-SFT', trust_remote_code=True)
name = input('Enter a first name: ')
inputs = tokenizer(name, return_tensors='pt')
pred, _ = model.predict_gender(inputs.input_ids)
print('M' if pred.item() == 1 else 'F')
"

License

Released under the Apache 2.0 license. Use it, modify it, and ship it under the terms of that license.

Built with a lot of caffeine ☕ by SpiceeChat

Built by PhysiQuanty (who did most of the work) and QuantaSparkLabs.

Safetensors: 32.3M params, tensor type F32
