- VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
  Paper • 2602.10693 • Published • 219
- Flash-KMeans: Fast and Memory-Efficient Exact K-Means
  Paper • 2603.09229 • Published • 74
- DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use
  Paper • 2603.11076 • Published • 5
Akki
hc99
AI & ML interests
None yet
Recent Activity
updated a collection 5 days ago: to read
updated a collection 6 days ago: to read
upvoted an article 22 days ago: GGML and llama.cpp join HF to ensure the long-term progress of Local AI