Interesting new techniques
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Paper
• 2401.01335
• Published
• 68
Lumiere: A Space-Time Diffusion Model for Video Generation
Paper
• 2401.12945
• Published
• 87
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
Paper
• 2403.06504
• Published
• 56
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
Paper
• 2403.20041
• Published
• 34
OmniGen: Unified Image Generation
Paper
• 2409.11340
• Published
• 115
Kolmogorov-Arnold Transformer
Paper
• 2409.10594
• Published
• 45
Multimodal Latent Language Modeling with Next-Token Diffusion
Paper
• 2412.08635
• Published
• 49
A3: Android Agent Arena for Mobile GUI Agents
Paper
• 2501.01149
• Published
• 22
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
Paper
• 2501.03218
• Published
• 35
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper
• 2503.05179
• Published
• 46
Modifying Large Language Model Post-Training for Diverse Creative Writing
Paper
• 2503.17126
• Published
• 36
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
Paper
• 2503.18878
• Published
• 119
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
Paper
• 2504.17432
• Published
• 40
Parallel Scaling Law for Language Models
Paper
• 2505.10475
• Published
• 83
Paper
• 2505.14674
• Published
• 37
Using Reinforcement Learning to Train Large Language Models to Explain Human Decisions
Paper
• 2505.11614
• Published
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper
• 2505.15045
• Published
• 55
Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
Paper
• 2504.05258
• Published
• 1
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks
Paper
• 2506.00411
• Published
• 31
Aligning Latent Spaces with Flow Priors
Paper
• 2506.05240
• Published
• 27
Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
Paper
• 2506.05629
• Published
• 37
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Paper
• 2506.20452
• Published
• 19
Lizard: An Efficient Linearization Framework for Large Language Models
Paper
• 2507.09025
• Published
• 19
Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation
Paper
• 2509.18824
• Published
• 23
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation
Paper
• 2601.21420
• Published
• 42