Let's Talk about AI
Hello, here is an open space for everyone to talk, share, ask and show anything about AI.
Has anyone pre-trained an LLM from scratch? If so, please share your experience, things to consider while training, notes, tips, etc.
Hi, I am also interested in LLMs. I am about to start this research next week; please share any inputs.
Hey @Shashank2k3, if you want your own LLM, first you need a huge amount of data. You can start by fine-tuning already available good LLMs like Gemma, Phi, LLaMA, Mistral, etc. with your dataset. Start with small models in the 4 to 7B parameter range. For pre-training an LLM from scratch you need enormous data, serious resources like heavy-duty GPUs and CPUs, and knowledge of training techniques, NLP, etc. You can always brainstorm with ChatGPT to learn more.
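To make the "start small, fine-tuning is much cheaper" advice concrete, here is a back-of-envelope memory estimate. It assumes the common rule of thumb of ~16 bytes of training state per parameter for full fine-tuning with Adam in mixed precision (fp16 weights + grads, fp32 master weights, fp32 Adam moments), ignoring activations; real numbers vary by framework and settings.

```python
# Rough GPU-memory estimate for *full* fine-tuning with Adam in mixed precision.
# Rule of thumb (assumption, not a measurement): ~16 bytes per parameter
#   = 2 (fp16 weights) + 2 (fp16 grads) + 4 (fp32 master weights) + 8 (fp32 Adam moments).
# Activation memory is extra and depends on batch size / sequence length.

def full_finetune_gib(n_params: float, bytes_per_param: int = 16) -> float:
    """Approximate optimizer + weight + gradient memory in GiB."""
    return n_params * bytes_per_param / 2**30

for size in (4e9, 7e9):
    print(f"{size / 1e9:.0f}B params -> ~{full_finetune_gib(size):.0f} GiB of training state")
# -> roughly 60 GiB for 4B and 104 GiB for 7B, before activations
```

This is why even a 7B model usually needs multiple GPUs (or parameter-efficient methods like LoRA/QLoRA, which shrink the trainable state dramatically) for full fine-tuning, and why pre-training from scratch is a different league of hardware entirely.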
Hey @kalashshah19 , thanks for the input! I already have a solid foundation in these areas from my Bachelor's degree in AIML, and now I’m looking to dive deeper into the world of LLMs.
Great!
Yup, so what do you guys do? I mean, profession-wise!
I am an Associate Data Scientist at Casepoint.
What about you ?
Hey everyone,
Read my blog on NPUs and the OpenVINO toolkit: https://huggingface.co/blog/Neural-Hacker/openvino
Sure !
Happy New Year!
Happy New Year !
@JDhruv14 Amazing work brother
https://tatva.info/ is just amazing. Would love to work on something together 🤝
Whoa, thanks! I had forgotten to share my latest project, tatva. How did you find out about it? Also, if possible, please vote for me on Peerlist. The aim is to make tatva better.
I forget where I found it, but yesterday I noticed the developer name and saw the posts on X, so I thought I should promote it. I upvoted tatva on Product Hunt and Peerlist.
https://huggingface.co/Shaligram-Dewangan/Dhi-5B-Base
My senior (3rd year) trained this model from scratch.

Sarvam AI launches Indus, an interface powered by their 105B model
Try it here: https://indus.sarvam.ai
Also, you can use its mobile app here: https://play.google.com/store/apps/details?id=ai.sarvam.indus
New Open Model Release — Param2-17B-A2.4B (Thinking)
BharatGen released a reasoning-optimized MoE model designed for multilingual + Indic intelligence at the India AI Impact Summit.
Why it’s interesting:
• Strong performance across reasoning, factual QA & Indic benchmarks
• MoE architecture → ~2.4B active params → efficient inference
• Built for culturally grounded & multilingual tasks
• Solid balance vs reasoning-heavy DeepSeek R1
This signals serious progress toward sovereign, deployment-ready AI for Indian languages and real-world workflows.
Model: https://huggingface.co/bharatgenai/Param2-17B-A2.4B-Thinking
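A quick back-of-envelope on why the MoE design matters for inference cost, assuming the common approximation of ~2 FLOPs per active parameter per generated token (a hedged estimate, not a benchmark of this model):

```python
# Why 17B total / ~2.4B active parameters is cheap at inference.
# Assumption: decoding cost scales as ~2 FLOPs per *active* parameter per token,
# so only the routed experts contribute, not the full parameter count.

TOTAL_PARAMS = 17e9    # all experts combined
ACTIVE_PARAMS = 2.4e9  # parameters actually used per token

active_ratio = ACTIVE_PARAMS / TOTAL_PARAMS
moe_flops_per_token = 2 * ACTIVE_PARAMS     # MoE decode cost
dense_flops_per_token = 2 * TOTAL_PARAMS    # hypothetical dense 17B model

print(f"Active fraction: {active_ratio:.1%}")
print(f"~{dense_flops_per_token / moe_flops_per_token:.1f}x fewer FLOPs/token than a dense 17B")
```

In other words, per token it computes like a ~2.4B model while drawing on 17B parameters of capacity, roughly 7x cheaper per token than an equally sized dense model, though all 17B parameters still need to fit in memory.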