This model is a Mixture-of-Experts (MoE) created by merging several Qwen3 1.7B variants into a total of 32 experts. The experts span logic, reasoning, coding, mathematics, psychology, conversation, creativity, and roleplay.
This model runs fairly fast: I get 30 t/s running entirely on CPU, though your results may vary.
As a Qwen3 model, it supports think blocks, which can be disabled by adding "/no_think" to the system prompt.
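As a minimal sketch of the "/no_think" toggle described above (the helper name and system prompt wording are illustrative assumptions, not part of this model's API):

```python
# Sketch: disabling Qwen3 "think" blocks by placing /no_think in the
# system prompt. build_messages is a hypothetical helper for illustration.

def build_messages(user_prompt: str, thinking: bool = True) -> list[dict]:
    """Build a chat message list; prepend /no_think to disable think blocks."""
    system = "You are a helpful assistant."
    if not thinking:
        # Qwen3 treats /no_think in the system prompt as a soft switch
        # that suppresses <think>...</think> reasoning blocks.
        system = "/no_think " + system
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Summarize this model card.", thinking=False)
print(messages[0]["content"])
```

The resulting list can be passed to any chat-template-aware runtime (e.g. a `transformers` tokenizer's `apply_chat_template`) unchanged.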