
hanzlajavaid PRO

hanzla

AI & ML interests

Direct Preference Optimization, Supervised Finetuning, Stable Diffusion

Recent Activity

posted an update about 1 month ago
Reinforcement learning can sometimes produce emergent behavior with much simpler training setups than large-scale pre-training. I explored this idea by running a small GRPO experiment on Qwen3.5 4B, and the results were pretty exciting. Hypothesis: improving visual mathematical reasoning may also improve the model’s ability to transcribe LaTeX from images. I wrote a short breakdown of the experiment here: https://hanzlajavaid.github.io/blog/grpo-experiment-exploring-emergent-properties/
updated a model about 1 month ago
hanzla/Qwen3.5-4B-mathvista-GRPO
published a model about 1 month ago
hanzla/Qwen3.5-4B-mathvista-GRPO

Organizations

ZeroGPU Explorers, Journalists on Hugging Face, MLX Community, ModularityAI, Social Post Explorers, OpenAI gpt-oss Grants

upvoted a paper about 1 year ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 146
upvoted a paper over 1 year ago

Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming

Paper • 2408.16725 • Published Aug 29, 2024 • 53