I haven't run evaluation benchmarks on Gemma 4 yet; this Space is here for comparison.
Ali Bidaran PRO
alibidaran
AI & ML interests
AI researcher: LLMs, Computer Vision, Generative AI, NLP, Machine/Deep Learning, Reinforcement Learning
Recent Activity
updated a model 3 days ago: alibidaran/MITI_GPT_OSS_20BV1
published a model 3 days ago: alibidaran/MITI_GPT_OSS_20BV1
new activity 5 days ago on alibidaran/Qwen_COG_Thinker_Merged: Benchmark
Organizations: None yet
replied to their post 12 days ago
posted an update 14 days ago
With the release of Gemma 4, I launched a new Space called MEDPAI — a medical imaging analysis tool that combines object detection with multimodal AI.
Here's how it works:
Upload a CT scan or X-ray
Computer vision models detect and annotate findings
Gemma 4 33B generates a report or answers your questions about the image
Currently available detectors: dental analysis and bone fracture detection.
More models are in the pipeline — follow the Space to stay updated!
alibidaran/MEDPAI
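The three steps above can be sketched as a small pipeline. This is a minimal illustration, not the actual MEDPAI code: `detect_findings` is a stub standing in for a real detector (e.g. a YOLO-style model), and the prompt builder shows how detections could be handed to the multimodal model as grounded context.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Finding:
    label: str
    box: tuple   # (x1, y1, x2, y2) in pixel coordinates
    score: float

def detect_findings(image, detector: str) -> List[Finding]:
    """Stub detector: a real version would run a trained CV model on the scan."""
    if detector == "bone_fracture":
        return [Finding("fracture", (120, 80, 180, 140), 0.91)]
    return []

def build_report_prompt(findings: List[Finding], question: str) -> str:
    """Turn annotated detections into a grounded prompt for the multimodal LLM."""
    lines = [f"- {f.label} at {f.box} (confidence {f.score:.2f})" for f in findings]
    context = "\n".join(lines) if lines else "- no findings detected"
    return f"Detected findings:\n{context}\n\nTask: {question}"

prompt = build_report_prompt(
    detect_findings(None, "bone_fracture"),
    "Write a short radiology report.",
)
print(prompt)
```

In the real Space, `prompt` would be sent together with the image to the Gemma endpoint, so the language model reasons over the detector's findings rather than the raw pixels alone.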
replied to their post 15 days ago
Why GPT-4o?
posted an update 17 days ago
🧠 Introducing Qwen2.5 — Cognitive Reasoning Mode
I fine-tuned Qwen2.5 with GRPO to actually think before it answers — not just pattern-match.
Most LLMs mimic reasoning. This one builds a real cognitive path:
📌 Plan → understand the task
🔍 Monitor → reason step by step
✅ Evaluate → verify before answering
Every response follows a strict structured protocol:
<think> <planning> ... <monitoring> ... <evaluation> ... </think>
Then a clean, reasoning-free <output>.
The model self-checks its own structure. If a section is missing or malformed → the response is invalid.
This isn't chain-of-thought slapped on top. The reasoning protocol is baked in via RL.
🔗 Full README + inference code below 👇
alibidaran/Qwen_COG_Thinker_Merged
#AI #LLM #Qwen #ReasoningModels #GRPO #OpenSource
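A structure check like the one described can be sketched with a regex. This is my illustration of the idea, not the model's actual validator, and the per-section closing tags (`</planning>` etc.) are an assumption based on the post's tag names:

```python
import re

# Assumed protocol: every section present, in order, inside <think>,
# followed by a reasoning-free <output> block.
PROTOCOL = re.compile(
    r"<think>\s*<planning>.*?</planning>\s*"
    r"<monitoring>.*?</monitoring>\s*"
    r"<evaluation>.*?</evaluation>\s*</think>\s*"
    r"<output>.*?</output>\s*$",
    re.DOTALL,
)

def is_valid_response(text: str) -> bool:
    """A response is valid only if every protocol section is present, in order."""
    return PROTOCOL.match(text.strip()) is not None

good = ("<think><planning>restate task</planning>"
        "<monitoring>step-by-step reasoning</monitoring>"
        "<evaluation>checks pass</evaluation></think>"
        "<output>final answer</output>")
bad = "<think><planning>plan only</planning></think><output>answer</output>"
print(is_valid_response(good), is_valid_response(bad))
```

A check like this can also serve as a reward signal during GRPO training: responses that break the structure get penalized, which is how the protocol ends up baked in rather than prompted.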
posted an update 2 months ago
I’m excited to share PlaiTO, a reasoning-focused language model built on LLaMA 3.1 (8B) and optimized for humanities and social sciences.
PlaiTO is designed to go beyond surface-level text generation, emphasizing structured reasoning, conceptual clarity, and analytical depth—especially in domains centered on human behavior and social systems.
🎯 Focus Areas
Psychology
Management & Organizational Studies
Sociology
📊 MMLU Benchmark Results (100 samples per domain)
Professional Psychology: 76%
Management: 74%
Sociology: 75%
These results highlight PlaiTO’s strong performance in abstract, theory-heavy, and reasoning-driven tasks.
💡 Why PlaiTO?
Strong analytical and reasoning capabilities
Better handling of complex human-centered problems
Suitable for academic, educational, and research use cases
Balanced performance across multiple humanities disciplines
PlaiTO is ideal for conceptual analysis, case reasoning, academic discussion, and decision-support scenarios—while still requiring human oversight for high-stakes applications.
📌 Built on LLaMA 3.1, compliant with its licensing terms.
alibidaran/Platio_merged_model
reacted to prithivMLmods's post with ❤️ 4 months ago
Introducing the Qwen-Image-Edit-2511-LoRAs-Fast demo, featuring image property comparison and contrast, built on top of Gradio combined with the Rerun SDK. It supports single and multi-image edits with existing LoRAs that are lazily loaded. (Note: This is still an experimental Space for Qwen-Image-Edit-2511.)
⭐ Space Demo: prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast
⭐ GitHub: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-2511-LoRAs-Fast-Multi-Image-Rerun
⭐ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
To know more about it, visit the app page or the respective model page!
posted an update 5 months ago
This shared notebook contains the MMLU benchmark evaluation for my latest reasoning model in the sociology domain. The results show that few-shot prompting in the system prompt can significantly improve the model's question-answering performance.
Model link:
alibidaran/GRPO_LLAMA3-instructive_reasoning1
Evaluation notebook:
https://www.kaggle.com/code/alibidaran/mmlu-socialogy-thinking-evals?scriptVersionId=277240033
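The few-shot setup can be sketched in a few lines. This is an illustrative skeleton, not the notebook's code: the example question is toy data, and a real run would send `build_prompt`'s output to the model and collect its answer letters.

```python
# One worked example shown to the model before the real question (few-shot).
FEW_SHOT = [
    ("Which thinker coined the term 'sociological imagination'?",
     ["Weber", "Durkheim", "Mills", "Marx"], "C"),
]

def build_prompt(question, choices):
    """Prepend worked examples so the model sees the expected answer format."""
    parts = []
    for q, ch, ans in FEW_SHOT:
        opts = " ".join(f"{letter}) {c}" for letter, c in zip("ABCD", ch))
        parts.append(f"Q: {q}\n{opts}\nAnswer: {ans}")
    opts = " ".join(f"{letter}) {c}" for letter, c in zip("ABCD", choices))
    parts.append(f"Q: {question}\n{opts}\nAnswer:")
    return "\n\n".join(parts)

def accuracy(predictions, gold):
    """Fraction of MMLU answer letters the model got right."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

prompt = build_prompt("Who wrote 'Suicide' (1897)?",
                      ["Durkheim", "Comte", "Spencer", "Simmel"])
print(prompt)
print(accuracy(["A", "C", "B"], ["A", "C", "D"]))  # 2 of 3 correct
```

The worked example anchors both the answer format and the reasoning style, which is the mechanism behind the improvement the notebook reports.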
posted an update 7 months ago
Hi community, I recently added my LoRA for SD 1.5, trained on a Midjourney dataset at 512x512 resolution. You can download and use it via the link below.
alibidaran/LORA_SDX1.5_Midjourney
reacted to sergiopaniego's post with 👍 9 months ago
Just included example scripts for aligning models using GSPO (including VLM example) 🙆♂️🙆♂️
GSPO is the latest RL alignment algo by @Alibaba_Qwen and it's already supported in the latest TRL v0.20 release.
Super-easy-to-get-started example scripts below, GO run them!👩💻👩💻
🧑🎨 Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
🦄 VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
🧩 More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
🧙♂️ GSPO paper: Group Sequence Policy Optimization (2507.18071)
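The core change GSPO makes relative to GRPO/PPO can be sketched in plain Python: the importance ratio is computed once per sequence, as the length-normalized geometric mean of the per-token ratios, rather than once per token. Function names and numbers below are mine for illustration; they are not TRL's API.

```python
import math

def gspo_sequence_ratio(logp_new, logp_old):
    """GSPO-style sequence-level importance ratio: average the per-token
    log-ratios (length normalization) before exponentiating, instead of
    forming one clipped ratio per token as in GRPO/PPO."""
    assert len(logp_new) == len(logp_old)
    mean_log_ratio = sum(n - o for n, o in zip(logp_new, logp_old)) / len(logp_new)
    return math.exp(mean_log_ratio)

def clipped_objective(ratio, advantage, eps=0.2):
    """PPO-style clipping, applied here to the whole-sequence ratio."""
    clipped = max(min(ratio, 1 + eps), 1 - eps)
    return min(ratio * advantage, clipped * advantage)

# Toy per-token log-probs for one response under the new and old policies.
r = gspo_sequence_ratio([-1.0, -0.5, -0.2], [-1.2, -0.6, -0.3])
print(r, clipped_objective(r, advantage=1.0))
```

Because the whole sequence shares one ratio, clipping either keeps or discounts the entire response, which is the stability argument the paper makes for long generations.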
posted an update 9 months ago
Hello community.
Here is my latest LLM-based TTS model, fine-tuned from the Spark TTS model on the AnimeVox dataset.
Here is a demo showing how to use the model:
alibidaran/Spark_Anime
If you want to use this model in your own project, you can download it directly from the link below:
alibidaran/Spark_Anime