🤝 Open to Collab


Cihangir Bozdogan

Cihangirbozdogan

AI & ML interests

None yet

Recent Activity

new activity 7 days ago

zaakirio/gemma-4-12b-it-uncensored-GGUF:MTP

liked a model 23 days ago

zaakirio/LFM2.5-1.2B-Instruct-Uncensored

liked a model about 1 month ago

NemoStation/Marlin-2B


Organizations

None yet

New activity in zaakirio/gemma-4-12b-it-uncensored-GGUF 7 days ago

MTP

👍 1

#1 opened 20 days ago by neoOpus

liked a model 23 days ago

zaakirio/LFM2.5-1.2B-Instruct-Uncensored

Text Generation • 1B • Updated 23 days ago • 196 • 2

liked a model about 1 month ago

NemoStation/Marlin-2B

Video-Text-to-Text • 2B • Updated 25 days ago • 19.4k • 543

replied to Crownelius's post about 2 months ago

That curiosity is the best fuel there is. Looking forward to seeing CompactAI 👀

reacted to Crownelius's post with 🔥 about 2 months ago

Post

5956

My Huggingface journey has been a trip!
I wanted to take the time to thank each and every one of you for using my dataset and getting it to go as far as it did. Believe it or not, some neanderthal was and maybe still is trending on huggingface.

Not only did my dataset reach number one, my fine-tuned qwen3.5 model did as well. Top 10. Honestly, ain't much left to do here.

Y'all have given me the desire, no... the craving for more. I am absolutely obsessed with AI now. I want to tweak it... I want to take it apart, just to see what makes everything tick. I want to put it together like Frankenstein and his monster.

The only thing that's stopping this guy is compute. I don't mind spending every penny I have on this. I desperately want to drive AI forward, even just a little bit.

I never knew the clanker hater from a year ago would be saying this.

Thank you all from the bottom of my heart.

Looking forward to showing you what I'm cooking up next. @CompactAI is your only hint!

3 replies

reacted to SeaWolf-AI's post with ❤️ about 2 months ago

Post

5098

🌌 Introducing Model Galaxy — a Living, Multimodal Fork of the HF Model Atlas

👉 Try it: FINAL-Bench/model-galaxy

This Space is a fork of the brilliant Eliahu/Model-Atlas, the official demo of "Charting and Navigating Hugging Face's Model Atlas" (Horwitz et al., arXiv 2503.10633). Their pre-computed HF model graph is the foundation of every node and edge you see, and we are deeply grateful for its open release.

The original atlas is a static snapshot of early 2025. Model Galaxy turns it into a living, multimodal map. We injected the 2026 trending originals that did not exist when the atlas was frozen — DeepSeek-V4, Hy3-preview, GLM-5.1, Kimi-K2, gpt-oss, Nemotron-3 Super / Nano / Omni, Hermes-4.3, Qwen3-Coder-Next, Llama-3.3, Granite-4.1, plus the latest multimodal releases (FLUX.2, ERNIE-Image, HunyuanImage / Video, LTX-2.3, Wan2.2, Kokoro-82M, VoxCPM2, Voxtral-TTS, whisper-v3-turbo, Gemma-4, Qwen3-Omni, Phi-4-mm) — each with proper base_model lineage edges.

We also added the complete VIDRAFT Darwin family ontology: 120 nodes covering Darwin Core, AETHER, every brand variant (Rogue, AWAXIS, TenOS, Warecube), NOESIS-Darwin multimodal extensions, and 40+ community quantizations — the most complete Darwin lineage view anywhere.

The name "Galaxy" is now literal: our three injected clusters are re-laid out as logarithmic spiral galaxies, with bigger models near the bright cores and quantizations scattering to the outer arms — just like real star mass distribution. A top-right toggle switches between Galaxy mode (deep-space gradient with 220 animated stars) and Atlas mode (clean white panels for reports). A 15-second progress bar narrates the render, and per-modality / per-company colors make every cluster legible at a glance.
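As a rough illustration of the layout idea described above, here is a minimal sketch of placing nodes along a logarithmic spiral so that larger models sit near the core. It is not the Space's actual code: the function name `spiral_layout`, the constants `a`, `b`, and `step`, and the single-arm simplification are all assumptions made for this example.

```python
import math

def spiral_layout(sizes, a=1.0, b=0.2, step=0.5):
    """Place nodes on the curve r = a * exp(b * theta).

    Hypothetical helper: nodes are ranked by size (largest first) and each
    rank advances theta by `step`, so big models land near the core and
    small ones drift toward the outer arm.
    """
    order = sorted(range(len(sizes)), key=lambda i: -sizes[i])
    coords = {}
    for rank, i in enumerate(order):
        theta = rank * step
        r = a * math.exp(b * theta)  # radius grows exponentially with angle
        coords[i] = (r * math.cos(theta), r * math.sin(theta))
    return coords

# Example: parameter counts in billions (illustrative values only)
layout = spiral_layout([7, 70, 1])
```

The largest node gets `theta = 0` and therefore the smallest radius; every later rank sits strictly farther from the center.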

Final scale: 22,480 nodes in the default Modalities atlas, 137,324 in the Large NLP atlas, and a 277-node compact Darwin + Trending view for instant exploration. Feedback and PRs welcome.

reacted to DavidAU's post with ❤️ about 2 months ago

Post

22425

Uncensored, Heretic, Qwen 3.6 27B GGUFs - Exceeds all quant metrics and core model metrics too.

Tuned 27B Heretic Uncensored quants from IQ2M to Q8.
IQ2M is 83% of BF16, with Q6 just under 98% of BF16 precision.
Q8: 98.47% of BF16 precision.
NEO/Code DI-Imatrix Quants.

Exceeds all 5 metrics for "censored" quants too.

All metrics posted.

The tuned model, from which the quants were built, also exceeds Qwen 3.6 27B core metrics.

DavidAU/Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF

9 replies

reacted to Crownelius's post with 🔥 about 2 months ago

Post

3841

[DAY ONE] PROJECT CROWFEATHER 4/30/2026
...The day I forgot to attach wandb.ai
Just dropped Crowfeather-50m, the first checkpoint in a series, and yeah, no graphs.

Crowfeather/Crowfeather-50m

54.5M params. Pretrain only. 17,500 steps banked on FineWeb-edu before Thunder credits ran dry. About 2.3B tokens, no SFT yet.

Architecture: Gemma-4 alternating sliding/global attention (1024 window, last layer always global) plus DeepSeek-V4 Muon optimizer plus WSD scheduler plus Gemma-2 logit soft-cap plus PaLM z-loss. Recipe in the model card.
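For readers unfamiliar with some of these pieces, here is a minimal sketch of three of them in plain Python. It is not taken from the Crowfeather model card: the cap value of 30.0, the z-loss coefficient of 1e-4, and the strict even/odd alternation are assumptions chosen for illustration, and the helper names are hypothetical.

```python
import math

def soft_cap(logits, cap=30.0):
    """Logit soft-capping: squash logits smoothly into [-cap, cap] via tanh.
    The cap value here is an assumption, not the model's setting."""
    return [cap * math.tanh(x / cap) for x in logits]

def z_loss(logits, coeff=1e-4):
    """Auxiliary z-loss: coeff * (log Z)^2 with Z = sum(exp(logits)).
    Penalizes the softmax normalizer drifting far from 1."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return coeff * log_z ** 2

def attention_pattern(num_layers):
    """Alternate sliding-window and global layers, forcing the last
    layer to be global. The exact alternation order is an assumption."""
    pattern = ["sliding" if i % 2 == 0 else "global" for i in range(num_layers)]
    pattern[-1] = "global"
    return pattern

print(attention_pattern(5))
```

With an odd layer count, the final layer would otherwise be a sliding-window layer, which is why the last entry is overridden.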

What it can do: writes grammatical English. Knows that France has Rhine-adjacent monasteries (it picked Rouen instead of Paris but the vocabulary is in there). Tells stories about Mr. Fabien.

What it can't do yet: facts, code, math. Base LM, no SFT, no instruction tuning.

The series:
Every additional training run becomes another model card here
Every model card gets a matching post on this profile
Continuation goes to Colab next, picking up from step 17500 out of 100k

Limited to one post a day on Hugging Face, so updates will trickle out at that pace. Follow @Crownelius and @Crowfeather if you want to watch this thing learn in public. Next drop will either come with the finished pre-train or whatever step I land on before the bank takes my credit card away.

Graphs will be available on my NEXT model lol

-Shane

3 replies

commented on Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries 2 months ago

What stands out: all 16 libraries converged on the same disaggregated architecture but diverged sharply on staleness management, and the hybrid trend (depth bounding + optional IS correction) feels right. Per-sample model_version tagging is the pragmatic foundation; once you have it, every other staleness strategy becomes a policy choice rather than an architectural rewrite. The MoE training-inference mismatch is the sleeper insight: "Keep Routing" and "Keep Sampling Mask" stop being optimizations and become correctness requirements. Excellent survey.
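To make the model_version point concrete, here is a minimal sketch of how a version tag turns staleness handling into a policy choice. It does not reproduce any of the surveyed libraries: `Sample`, `filter_by_staleness`, `is_weight`, and the defaults `max_lag` and `clip` are hypothetical names and values used only for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Sample:
    tokens: list
    behavior_logprob: float  # log-prob under the policy that generated it
    model_version: int       # policy version tag attached at generation time

def filter_by_staleness(samples, current_version, max_lag):
    """Depth bounding: keep only samples at most max_lag versions old."""
    return [s for s in samples if current_version - s.model_version <= max_lag]

def is_weight(current_logprob, sample, clip=2.0):
    """Optional truncated importance-sampling correction for stale samples."""
    ratio = math.exp(current_logprob - sample.behavior_logprob)
    return min(ratio, clip)

batch = [Sample([1], -1.0, 5), Sample([2], -1.2, 8), Sample([3], -0.9, 10)]
fresh = filter_by_staleness(batch, current_version=10, max_lag=2)
```

Because every sample carries its own version, swapping depth bounding for IS correction (or combining them) only changes which of these functions the trainer calls, not how rollouts are stored.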

commented on Gemma 4 VLA Demo on Jetson Orin Nano Super 2 months ago

Love the design choice of a single look_and_answer tool and letting Gemma 4 decide when vision is actually needed — much cleaner than always-on vision encoders. And going native llama.cpp over Docker is the right call if you want to actually swap the mmproj. A 5B Q4_K_M doing tool-routed multimodal reasoning on 8GB unified memory, with Parakeet + Kokoro on-device, is a strong signal for where edge agents are heading. Nice work.

upvoted an article 2 months ago

Article

Gemma 4 VLA Demo on Jetson Orin Nano Super

nvidia • Apr 22 • 10
