Cosmos3 Collection Omnimodal World Models for Physical AI • 15 items • Updated about 2 hours ago • 70
view article Article Welcome NVIDIA Cosmos 3: The First Open Omni-model for Physical AI Reasoning and Action nvidia • 3 days ago • 64
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 7 days ago • 136
OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning Paper • 2605.28691 • Published 8 days ago • 24
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation Paper • 2605.15141 • Published 21 days ago • 93
HumanNet: Scaling Human-centric Video Learning to One Million Hours Paper • 2605.06747 • Published 28 days ago • 52
view article Article Nucleus-Image: Scaling Text-to-Image with Sparse Mixture of Experts NucleusAI • Apr 14 • 11
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published Apr 13 • 72
view article Article `LeRobotDataset:v3.0`: Bringing large-scale datasets to `lerobot` +9 fracapuano, aractingi, lhoestq, CarolinePascal, pepijn223, jadechoghari, cadene, aliberts, AdilZtn, nepyope, imstevenpmwork • Sep 16, 2025 • 56
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published Mar 24 • 63
Manifold-Aware Exploration for Reinforcement Learning in Video Generation Paper • 2603.21872 • Published Mar 23 • 34
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published Mar 17 • 109
view article Article Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines +2 YiYiXu, OzzyGT, dn6, sayakpaul • Mar 5 • 51