Shantanu Acharya

shantanuacharya

https://www.shantanuacharya.com/

AI & ML interests

Large Language Models and Computer Vision

Recent Activity

upvoted an article 2 months ago

Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI

upvoted an article 2 months ago

Introducing SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding

authored a paper 5 months ago

NVIDIA Nemotron 3: Efficient and Open Intelligence

Organizations

upvoted 2 articles 2 months ago

Article

Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI

nvidia • Mar 17 • 64

Article

Introducing SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding

nvidia • Mar 19 • 47

authored 2 papers 5 months ago

NVIDIA Nemotron 3: Efficient and Open Intelligence

Paper • 2512.20856 • Published Dec 24, 2025 • 44

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2512.20848 • Published Dec 23, 2025 • 43

upvoted a paper 5 months ago

NVIDIA Nemotron 3: Efficient and Open Intelligence

Paper • 2512.20856 • Published Dec 24, 2025 • 44

upvoted an article 11 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 777

upvoted a paper 12 months ago

Thinkless: LLM Learns When to Think

Paper • 2505.13379 • Published May 19, 2025 • 50

authored 2 papers about 1 year ago

SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling

Paper • 2504.08719 • Published Apr 11, 2025

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Paper • 2504.03624 • Published Apr 4, 2025 • 18

upvoted a paper about 1 year ago

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Paper • 2504.03624 • Published Apr 4, 2025 • 18

authored a paper over 1 year ago

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 53

upvoted a paper over 1 year ago

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 53

commented on a paper over 1 year ago

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 53

upvoted 2 papers over 1 year ago

nGPT: Normalized Transformer with Representation Learning on the Hypersphere

Paper • 2410.01131 • Published Oct 1, 2024 • 10

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 74

liked 3 models almost 2 years ago

upvoted a paper almost 2 years ago

HelpSteer2: Open-source dataset for training top-performing reward models

Paper • 2406.08673 • Published Jun 12, 2024 • 20

upvoted a paper about 2 years ago

NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment

Paper • 2405.01481 • Published May 2, 2024 • 30
