26

devilman

devilmanz

AI & ML interests

None yet

Recent Activity

liked a model 14 days ago
magiccodingman/Qwen3-4B-Thinking-2507-Unsloth-MagicQuant-Hybrid-GGUF
liked a model 2 months ago
Qwen/Qwen3-VL-4B-Thinking
reacted to codelion's post with 🔥 4 months ago
I recently added a recipe in ellora to improve reasoning capabilities to Gemma-3-1B using self-supervised learning. Model now shows step-by-step thinking in <think> tags before answering. Logic puzzle accuracy: 61% → 84%. 3 hours training on single GPU. 🧠 Used GRPO where model generates multiple responses and learns to prefer better reasoning. Works surprisingly well for making smaller models more transparent.
🔗 Colab: https://colab.research.google.com/github/codelion/ellora/blob/main/Ellora_Recipe_2_Reasoning_LoRA_with_Self-Rewarding_GRPO.ipynb
🤗 Model: https://huggingface.co/codelion/gemma-3-1b-it-reasoning-grpo-lora
💻 Code: https://github.com/codelion/ellora

Organizations

None yet

devilmanz's datasets

None public yet