Dominick Wirzba
Chronuid
AI & ML interests
None yet
Recent Activity
reacted to sergiopaniego's post with 🔥 21 days ago
Qwen3.5 dense (smol 🤏) models just dropped
- natively multimodal
- 0.8B · 2B · 4B · 9B (+ base variants)
- 262K context extensible to 1M
- built-in thinking
fine-tune them with TRL out of the box → SFT, GRPO, DPO and more!
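for a feel of what SFT actually trains on: each example is a list of chat messages that gets flattened into one supervised text via the model's chat template. a toy, stdlib-only sketch of that flattening step — the `<|role|>` template and the `to_training_text` helper are simplified stand-ins for illustration, not TRL or Qwen APIs:

```python
# Toy sketch of the messages -> training-text step behind SFT.
# The "<|role|>" markers below are a simplified stand-in for a real
# chat template (TRL applies the tokenizer's own template for you).

def to_training_text(messages):
    """Flatten one chat-format example into a single supervised string."""
    parts = [f"<|{m['role']}|>\n{m['content']}" for m in messages]
    return "\n".join(parts) + "\n<|end|>"

example = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
]
print(to_training_text(example))
```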
examples: https://huggingface.co/docs/trl/example_overview
collection: https://huggingface.co/collections/Qwen/qwen35
reacted to sergiopaniego's post with 🔥 21 days ago
did you know you can train agentic models with RL by deploying the environments on HF Spaces? 🤗
with TRL + OpenEnv, your training script connects to remote environments hosted as Spaces
want to train faster? → just add more Spaces (TRL handles the parallelization natively)
we used this to train a model to solve the trolley problem in CARLA: two HF Spaces, each running a full driving simulator on a T4 GPU
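the fan-out pattern is simple to picture: one client per Space, rollouts collected in parallel each training step. a schematic stdlib sketch — `RemoteEnv` and `collect_rollouts` are hypothetical stand-ins, not the OpenEnv API (real code would speak HTTP to each Space):

```python
import concurrent.futures

class RemoteEnv:
    """Hypothetical stand-in for a client bound to one environment Space."""
    def __init__(self, url):
        self.url = url

    def rollout(self, policy):
        # Real code would POST actions / GET observations over HTTP;
        # here we just return a dummy (url, reward) pair.
        return (self.url, policy())

def collect_rollouts(urls, policy):
    """Fan one rollout request out to every Space in parallel."""
    envs = [RemoteEnv(u) for u in urls]
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(envs)) as pool:
        futures = [pool.submit(env.rollout, policy) for env in envs]
        return [f.result() for f in futures]

# More Spaces -> more rollouts per step; the URLs are placeholders.
spaces = [
    "https://user-env-a.hf.space",
    "https://user-env-b.hf.space",
]
results = collect_rollouts(spaces, policy=lambda: 1.0)
print(results)
```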
full write-up with code and results → https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl
reacted to danielhanchen's post with 🔥 about 2 months ago
We collaborated with Hugging Face to enable you to train MoE models 12× faster with 35% less VRAM via our new Triton kernels (no accuracy loss). 🤗
Train gpt-oss locally on 12.8GB VRAM with our free notebooks: https://unsloth.ai/docs/new/faster-moe
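for context on what those kernels fuse: an MoE layer routes each token to its top-k experts and mixes their outputs by softmaxed router weight. a pure-Python toy of top-2 routing over scalar "tokens" — illustrative only, not the Triton kernels, which fuse exactly this gather/compute/scatter over tensors:

```python
import math

def moe_forward(token, router_logits, experts, k=2):
    """Route one token to its top-k experts; mix outputs by router weight."""
    topk = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over just the selected experts' logits.
    m = max(router_logits[i] for i in topk)
    exps = [math.exp(router_logits[i] - m) for i in topk]
    z = sum(exps)
    weights = [e / z for e in exps]
    return sum(w * experts[i](token) for w, i in zip(weights, topk))

# Four toy "experts", each a trivial scalar function.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3, lambda x: x * x]
out = moe_forward(3.0, router_logits=[0.1, 2.0, -1.0, 2.0],
                  experts=experts, k=2)
print(out)  # experts 1 and 3 tie -> 0.5 * 6.0 + 0.5 * 9.0 = 7.5
```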