Active filters: modelopt
nvidia/Gemma-4-31B-IT-NVFP4
Text Generation
• 21B • Updated • 366k
• 315
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4
Text Generation
• 67B • Updated • 1.71M
• 260
bg-digitalservices/Gemma-4-26B-A4B-it-NVFP4
Text Generation
• 15B • Updated • 32.1k
• 17
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8
Text Generation
• 124B • Updated • 1.09M
• 227
cosmicproc/gemma-4-E4B-it-NVFP4
Image-Text-to-Text
• 6B • Updated • 5.24k
• 5
nvidia/Qwen3.5-397B-A17B-NVFP4
Text Generation
• Updated • 427k
• 88
Alexzander85/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-NVFP4-MLP-FP8KV
Text Generation
• 8B • Updated • 1.38k
• 9
nvidia/MiniMax-M2.5-NVFP4
Text Generation
• 116B • Updated • 33.1k
• 21
bg-digitalservices/Gemma-4-26B-A4B-it-NVFP4A16
Text Generation
• 15B • Updated • 3.77k
• 4
nvidia/Qwen3-30B-A3B-NVFP4
Text Generation
• 16B • Updated • 215k
• 28
Text Generation
• Updated • 798k
• 78
Image-Text-to-Text
• 7B • Updated • 41.8k
• 11
Text Generation
• 183B • Updated • 3.44k
• 13
NVFP4/Qwen3-Coder-30B-A3B-Instruct-FP4
Text Generation
• 16B • Updated • 19.8k
• 23
Text Generation
• 8B • Updated • 246k
• 7
nvidia/Qwen3-Next-80B-A3B-Instruct-NVFP4
Text Generation
• Updated • 20.5k
• 37
chankhavu/Nemotron-Cascade-2-30B-A3B-NVFP4
Text Generation
• 16B • Updated • 14.1k
• 9
Neural-ICE/Gemma-4-26B-A4B-it-NVFP4
Text Generation
• 15B • Updated • 805
• 2
437B • Updated • 21
• 2
nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4
56B • Updated • 63.9k
• 29
nvidia/Phi-4-multimodal-instruct-NVFP4
4B • Updated • 1.96k
• 10
nvidia/Phi-4-reasoning-plus-NVFP4
8B • Updated • 1.48k
• 8
nvidia/Llama-3.1-8B-Instruct-NVFP4
5B • Updated • 110k
• 8
Text Generation
• 5B • Updated • 30.3k
• 16
Text Generation
• 17B • Updated • 143k
• 14
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4
Text Generation
• 5B • Updated • 5.74k
• 14
nvidia/Kimi-K2-Thinking-NVFP4
Text Generation
• Updated • 11k
• 30
nvidia/Qwen3-Next-80B-A3B-Thinking-NVFP4
Text Generation
• Updated • 3.76k
• 55
nvidia/Qwen3-235B-A22B-Thinking-2507-NVFP4
Text Generation
• Updated • 769
• 7
nvidia/Qwen3-235B-A22B-Instruct-2507-NVFP4
Text Generation
• 120B • Updated • 2.8k
• 7