All HF Hub posts

SeaWolf-AI 
posted an update 1 day ago
🧬 Darwin-35B-A3B-Opus — The Child That Surpassed Both Parents

What if a merged model could beat both its parents? We proved it can.
Darwin-35B-A3B-Opus is a 35B MoE model (3B active) built with our Darwin V5 engine — the first evolution system that CT-scans parent models before merging them.
🤗 Model: FINAL-Bench/Darwin-35B-A3B-Opus

The result speaks for itself: GPQA Diamond 90.0%, versus Father (Qwen3.5-35B-A3B) at 84.2% and Mother (Claude 4.6 Opus Distilled) at 85.0%. That's a relative gain of +6.9% over Father and +5.9% over Mother. Not a tradeoff — a genuine leap. Meanwhile, MMMLU sits at 85.0% (Father: 85.2%), multimodal is fully intact, and all 201 languages are preserved.

How? Model MRI changed everything. Traditional merging is guesswork. Darwin V4 added evolution. Darwin V5 added X-ray vision. Model MRI scans each parent layer by layer and discovers: Mother's L34–L38 is the reasoning engine (peak cosine distance), 50–65% of Mother's experts are dead (killed by text-only distillation), and Father is a healthy generalist with every expert alive. The prescription: transplant Mother's reasoning brain at L38 (90% weight), replace her dead experts with Father's living ones, and let Father's router handle the output layer. Reasoning went up. Versatility stayed intact. No tradeoff — just evolution.
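The Darwin V5 engine isn't public yet, so here is only a toy Python sketch of the two operations the post describes: a layer-by-layer divergence scan (cosine distance over flattened weights, which is an assumption about what "Model MRI" measures) and a weighted transplant of donor layers into the child. The function names are illustrative, not Darwin's API.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two flattened weight vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def scan_layers(mother, father):
    # "Model MRI" pass: score each layer by how far the two parents diverge.
    return {name: cosine_distance(mother[name], father[name]) for name in mother}

def weighted_transplant(child, donor, layer_names, alpha=0.9):
    # Transplant donor layers into the child at the given blend weight:
    # child = alpha * donor + (1 - alpha) * child.
    for name in layer_names:
        child[name] = [alpha * d + (1 - alpha) * c
                       for d, c in zip(donor[name], child[name])]
    return child
```

A peak in the scan (like the claimed L34–L38 reasoning block) would then be a candidate for a high-alpha transplant; the expert-replacement and router steps are beyond this sketch.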

35B total, 3B active (MoE) · GPQA Diamond 90.0% · MMMLU 85.0% (201 languages) · Multimodal Image & Video · 262K native context · 147.8 tok/s on H100 · Runs on a single RTX 4090 (Q4) · Apache 2.0
Darwin V5's full algorithm and technical details will be released alongside an upcoming paper.

🚀 Live Demo: FINAL-Bench/Darwin-35B-A3B-Opus

🏆 FINAL Bench Leaderboard: FINAL-Bench/Leaderboard

📊 ALL Bench Leaderboard: FINAL-Bench/all-bench-leaderboard

Built by VIDRAFT · Supported by the Korean Government GPU Support Program
danielhanchen 
posted an update 1 day ago
A new way to use Unsloth.

Coming soon...
MikeDoes 
posted an update about 23 hours ago
Things our clients and open source actually said to us this year:

"Finally, someone built a synthetic PII training dataset for German."

"Does it have localised information? Not just the language, the actual format. That must have been a lot of work that we can save on our side."

"We operate in 12 EU countries. Your dataset is the only one that covers all of them which has helped us out a lot in compliance especially because it's synthetic."

Every language has strong PII localization: names, addresses, IDs, phone numbers, and dates in the real format of that country.

23 languages. 29 regions. 3 scripts. 1,428,143 examples.

100% synthetic. Zero real personal data. Free on Hugging Face.
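To make "the actual format, not just the language" concrete, here is a small Python sketch of per-country format validation. The regexes are illustrative ones I wrote for this sketch; they are not taken from the dataset itself.

```python
import re

# Hypothetical per-country format rules (illustrative, NOT the dataset's own).
FORMATS = {
    "DE": {"postal": r"\d{5}", "phone": r"\+49 ?\d{2,4} ?\d{5,8}"},
    "FR": {"postal": r"\d{5}", "phone": r"\+33 ?\d( ?\d{2}){4}"},
}

def matches_locale(country, field, value):
    # True only if the value matches that country's real-world format.
    return re.fullmatch(FORMATS[country][field], value) is not None
```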
sergiopaniego 
posted an update 1 day ago
TRL is officially an adult 🥳

Excited to announce TRL v1.0❗️

Head to the blog to see how we got here and what's next for this post-training library, designed to keep pace with the field.

https://huggingface.co/blog/trl-v1
alibidaran 
posted an update 1 day ago
🧠 Introducing Qwen3.5 — Cognitive Reasoning Mode

I fine-tuned Qwen2.5 with GRPO to actually think before it answers — not just pattern-match.

Most LLMs mimic reasoning. This one builds a real cognitive path:

📌 Plan → understand the task
🔍 Monitor → reason step by step
✅ Evaluate → verify before answering

Every response follows a strict structured protocol:
<think> <planning> ... <monitoring> ... <evaluation> ... </think>
Then a clean, reasoning-free <output>.

The model self-checks its own structure. If a section is missing or malformed → the response is invalid.
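A structure check like the one described can be sketched in a few lines of Python. The tag names follow the template in the post; the per-section closing tags (`</planning>` etc.) are my assumption about the trained format.

```python
import re

PROTOCOL = re.compile(
    r"<think>\s*"
    r"<planning>.+?</planning>\s*"
    r"<monitoring>.+?</monitoring>\s*"
    r"<evaluation>.+?</evaluation>\s*"
    r"</think>\s*"
    r"<output>.+?</output>",
    re.DOTALL,
)

def is_valid(response: str) -> bool:
    # Valid only if every section is present, non-empty, and in order.
    return PROTOCOL.fullmatch(response.strip()) is not None
```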

This isn't chain-of-thought slapped on top. The reasoning protocol is baked in via RL.

🔗 Full README + inference code below 👇
alibidaran/Qwen_COG_Thinker_Merged

#AI #LLM #Qwen #ReasoningModels #GRPO #OpenSource
reaperdoesntknow 
posted an update 1 day ago
Your Loss Function Has Singularities. Classical Calculus Can't See Them.

Introducing Discrepancy Calculus (DISC) — treating training singularities as structure, not noise.

Loss plateaus, mode collapse, catastrophic forgetting, distilled models that know things the teacher never taught — we engineer around these. But what if those singularities are the actual structure of the learning problem?

The core insight: Every BV function decomposes into smooth (what classical calculus handles), jump (capability emergence, loss plateaus breaking), and Cantor (ghost imprinting — knowledge transferring through weight-space topology, not gradient signal). Classical analysis sees only the first. DISC sees all three.
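For reference, the three-part split this paragraph invokes is the classical decomposition of the derivative measure of a function of bounded variation: an absolutely continuous part, a jump part concentrated on the jump set, and a Cantor part.

```latex
Du = \underbrace{\nabla u \,\mathcal{L}^n}_{\text{smooth}}
   \;+\; \underbrace{(u^{+}-u^{-})\,\nu_{u}\,\mathcal{H}^{n-1}\llcorner J_{u}}_{\text{jump}}
   \;+\; \underbrace{D^{c}u}_{\text{Cantor}}
```

The identity itself is standard; how DISC maps these three measures onto training dynamics is the paper's own contribution.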

The paper proves this isn't alternative notation — it's strictly larger. The Meta-Discrepancy Theorem: where singularities exist, the classical FTC/MVT/chain-rule package is provably impossible.

What it explains:

TopologicalQwen exhibited literary reasoning from physics-only data — the Cantor part explains how. DualMind's Explore→Examine→Response loop operationalizes DISC as inference dynamics. 50 models, 35K+ downloads, all built on this framework.

Paper: Discrepancy Calculus: Foundations and Core Theory (DOI: 10.57967/hf/8194) — 8 axioms, proofs, computational recipes.

Series: Structure Over Scale (DOI: 10.57967/hf/8165) → Three Teachers to Dual Cognition (DOI: 10.57967/hf/8184) → DISC Foundations

— Roy S. Colca Jr., Convergent Intelligence LLC: Research Division
qgallouedec 
posted an update 1 day ago
TRL v1.0 is out!

Hugging Face's TRL library is downloaded 3 million times a month. Over 130k models trained with it are public on the Hub, and major projects like @unsloth and @axolotl-ai-co build directly on top of it. v1.0 is the moment we acknowledged that responsibility explicitly, with a real stability contract.

The field hasn't settled. Building stable software in a domain that keeps invalidating its own assumptions is the actual problem we're solving. The answer is a design that can absorb the next shift without breaking what people rely on.

What's in v1.0: deep Hugging Face integration and low infrastructure burden.
What's next: asynchronous GRPO, better scaling support, and making training legible enough that agents can inspect and steer it.

pip install --upgrade trl


Read more: hf.co/blog/trl-v1
OzTianlu 
posted an update 2 days ago
https://github.com/lizixi-0x2F/March
I just released March, an open-source high-performance KV cache sharing library for LLM inference that uses Trie-based prefix deduplication.
When you run LLM services, you often see thousands of requests sharing the same system prompt and conversation history. But traditional KV cache systems store each sequence separately — duplicating the exact same data over and over again. Pure waste.
March uses a Trie structure to automatically detect and reuse identical token prefixes. Instead of storing [system_prompt + history] 1000 times, it's stored once. Everyone shares it.
- 80-97% memory reduction in prefix-heavy workloads (tested on SmolLM2-135M with 500 multi-turn conversations)
- Zero-copy queries — returns direct pointers into the memory pool, no expensive memcpy on the hot path
- Predictable memory usage — fixed-size page pool with O(L) complexity
- Trade-off: slightly slower than dict O(1) lookup, but the memory savings are worth it in production
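March's implementation isn't reproduced here, but the core idea is easy to sketch. The following toy Python trie (names and structure are mine, not March's API) shows why 1,000 requests sharing a 100-token system prompt allocate the prefix only once:

```python
class TrieNode:
    __slots__ = ("children", "kv")
    def __init__(self):
        self.children = {}   # token id -> TrieNode
        self.kv = None       # placeholder for the KV page this token maps to

class PrefixCache:
    """Toy trie-based prefix deduplication: identical token prefixes
    are stored once and shared across all requests."""
    def __init__(self):
        self.root = TrieNode()
        self.nodes = 0       # stand-in for allocated KV pages

    def insert(self, tokens):
        node = self.root
        for t in tokens:
            if t not in node.children:
                node.children[t] = TrieNode()
                self.nodes += 1   # only new (non-shared) tokens allocate
            node = node.children[t]

    def longest_shared_prefix(self, tokens):
        # How many leading tokens already have cached entries.
        node, hit = self.root, 0
        for t in tokens:
            if t not in node.children:
                break
            node = node.children[t]
            hit += 1
        return hit
```

The zero-copy queries and fixed-size page pool are where the real engineering lives; this sketch only demonstrates the deduplication.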
DedeProGames 
posted an update about 13 hours ago
🔥 GRM2 - The small one that surpasses the big ones.
What if a 3B-parameter model could beat a 32B-parameter model in every benchmark? We prove that it can.
GRM2 is a 3B-parameter model based on the Llama architecture, trained for long reasoning and high performance on complex tasks: the first 3B model to outperform Qwen3-32B in ALL benchmarks, and to outperform o3-mini in almost all benchmarks.
🤗 Model: OrionLLM/GRM2-3b
The first 3B-parameter model to generate over 1,000 lines of code and achieve a score of 39.0 on xBench-DeepSearch-2510.

🚀 Chat with GRM:
DedeProGames/GRM2-Chat

🏆 Download official GGUFs: OrionLLM/GRM2-3b-GGUF
ArtelTaleb 
posted an update about 22 hours ago
Hello, 3D lovers!

You have a 3D model sitting somewhere and just want to see it:
no software to install, no account to create.

Canva3D is a free browser viewer built for that exact moment.

Drop any format — GLB, OBJ, FBX, STL, USDZ — it just loads.
Swap the background with HDRI lighting to set the mood of your scene.
Record a video ready to share — orbit, animate, export.

That's it. No settings to configure. No GPU required.



👉 ArtelTaleb/canva3d