TuringPost (Turing Post)

Kseniase

posted an update 2 days ago

15 Outstanding Research Papers from NeurIPS 2025

NeurIPS 2025, as a premier annual event in machine learning and computational neuroscience, tackles major topics like the future of AI, current research, and the most difficult challenges. While we’re not attending this year, we’re following the updates closely, and today we’ve pulled together a quick, easy-to-digest roundup of standout papers so you can jump in without getting overwhelmed.

Here is a list of 15 papers from NeurIPS 2025, including 8 top research papers that received awards, along with 7 others that caught our attention:

1. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks → https://neurips.cc/virtual/2025/loc/san-diego/test-of-time/128328
Test of Time Award winner. Introduces the RPN, a small convnet that predicts objectness and boxes on shared features, enabling Faster R-CNN to share computation and run around 5 fps on a GPU

2. Artificial Hivemind: The Open-Ended Homogeneity of LMs (and Beyond) → https://neurips.cc/virtual/2025/loc/san-diego/poster/121421
Releases a huge open-ended prompt dataset, showing that LLMs often fall into an “artificial hivemind” – generating surprisingly similar answers – and measuring the resulting diversity collapse

3. Optimal Mistake Bounds for Transductive Online Learning → https://neurips.cc/virtual/2025/loc/san-diego/poster/119098
Settles a 30-year-old question by showing how much unlabeled data helps in online learning – it gives a precise quadratic advantage with tight matching bounds

4. Gated Attention for LLMs: Non-linearity, Sparsity, and Attention-Sink-Free → https://neurips.cc/virtual/2025/loc/san-diego/poster/120216
Demonstrates how gating actually affects attention: a simple sigmoid gate after Scaled Dot-Product Attention (SDPA) boosts performance, stability, and long-context behavior by adding useful nonlinearity and sparse modulation
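
To make the gating idea in item 4 concrete, here is a minimal single-head sketch in plain Python. It assumes the gate is an elementwise sigmoid computed from the layer input `x` through a hypothetical projection `w_gate` and multiplied into the SDPA output; the paper studies several gate placements, so treat this as one illustrative variant rather than the authors' exact formulation.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def sdpa(q, k, v):
    """Scaled dot-product attention for a single head (no masking)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul(q, [list(col) for col in zip(*k)])  # q @ k^T
    weights = [softmax([s * scale for s in row]) for row in scores]
    return matmul(weights, v)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_sdpa(x, q, k, v, w_gate):
    """SDPA output modulated by a sigmoid gate computed from the input x.

    w_gate is a hypothetical (d_model x d_v) projection introduced for this sketch.
    """
    attn = sdpa(q, k, v)
    gate_logits = matmul(x, w_gate)
    return [[sigmoid(g) * a for g, a in zip(g_row, a_row)]
            for g_row, a_row in zip(gate_logits, attn)]

# Toy example: two tokens, model width 2, identity projections for q, k, v.
x = [[1.0, 0.0], [0.0, 1.0]]
plain = sdpa(x, x, x)
half_open = gated_sdpa(x, x, x, x, [[0.0, 0.0], [0.0, 0.0]])      # gate = 0.5 everywhere
closed = gated_sdpa(x, x, x, x, [[-50.0, -50.0], [-50.0, -50.0]])  # gate ~ 0 everywhere
```

With zero gate weights every gate sits at 0.5, so the output is simply half the plain attention output; strongly negative gate logits drive the output toward zero, which is the sparse modulation the summary refers to.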

Read further below ⬇️
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 9 days ago

9 Recent advances in Multi-Agent Systems (all open-source)

The idea of splitting tasks across multiple agents instead of relying on one universal agent is now seen as one of the most effective ways to build an AI stack. Concepts like “agent swarms” were highlighted at the AI Engineer Code Summit in NYC (Nov 20–21) as the winning architecture. And this trend is not only about coding and software: it applies across all AI domains.

So here is some recent research that helps make multi-agent systems (MAS) better and keeps you up to date:

1. LatentMAS → Latent Collaboration in Multi-Agent Systems (2511.20639)
AI agents share their hidden "thoughts" directly in latent space instead of talking through text. This makes collaboration and reasoning much faster and more accurate (no extra training needed)

2. Puppeteer → Multi-Agent Collaboration via Evolving Orchestration (2505.19591)
Uses a “puppeteer” LLM that dynamically decides which agents (“puppets”) to call and in what order. By learning this orchestration with reinforcement learning (RL), the system solves complex tasks more efficiently and at lower compute cost

3. MADD → MADD: Multi-Agent Drug Discovery Orchestra (2511.08217)
A MAS with 4 agents for drug discovery. It lets researchers describe a drug discovery task in plain language. Then MADD automatically builds and runs the full hit-identification pipeline, making AI-driven drug design a simple end-to-end workflow

4. Multi-Agent Tool-Integrated Policy Optimization (MATPO) → Multi-Agent Tool-Integrated Policy Optimization (2510.04678)
Lets one LLM act as multiple agents (like a planner and a worker) by using different prompts and training them together with RL. So you get the benefits of a multi-agent system without needing multiple models

If you're interested in multi-agent trends for the future of software development, explore my article with the emergent playbook. This is super interesting → https://www.turingpost.com/p/aisoftwarestack
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Read further below ⬇️

Kseniase

posted an update 16 days ago

6 Essential Reads on Spatial Intelligence

In AI, spatial intelligence is basically the model’s “sense of space” – its ability to understand where things are, how they relate, and how they move. It lets an AI model navigate a room, interpret a scene, or figure out how objects fit together, like giving it a built-in mental map. For example, world models can't live without spatial intelligence.

Here are 6 good reads to explore what spatial intelligence is and how it's evolving:

1. From Words to Worlds: Spatial Intelligence is AI’s Next Frontier by Fei-Fei Li → https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence
Fei-Fei Li, the godmother of AI, is a key figure in spatial intelligence, since her work in computer vision, especially ImageNet, helped AI learn to recognize and understand objects in space. She's recently started a blog, and this post, in particular, argues that true intelligence requires grounding in space, understanding geometry, motion and consequences in the real world

2. Spatial Reasoning in Multimodal LLMs: A Survey of Tasks, Benchmarks and Methods → https://arxiv.org/abs/2511.15722
Breaks down how AI models handle spatial reasoning from a cognitive angle, maps all the existing tasks and benchmarks to that framework

3. What is Spatial Intelligence? → https://www.turingpost.com/p/cvhistory5
Our special article clearly explains what spatial intelligence actually is, why it matters, and how researchers are trying to boost it so machines can better understand and navigate the physical world

4. From 2D to 3D Cognition: A Brief Survey of General World Models → https://arxiv.org/pdf/2506.20134
Shows how AI world models are evolving from simple 2D perception to full-on 3D understanding, explaining the tech behind it, what new 3D abilities these models gain, and where they’re used in the real world

Read further below ⬇️
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 23 days ago

12 Types of JEPA

Since Yann LeCun together with Randall Balestriero released a new paper on JEPA (Joint-Embedding Predictive Architecture), laying out its theory and introducing an efficient practical version called LeJEPA, we figured you might need even more JEPA. Here are 7 recent JEPA variants plus 5 iconic ones:

1. LeJEPA → LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics (2511.08544)
Explains a full theory for JEPAs, defining the “ideal” JEPA embedding as an isotropic Gaussian, and proposes the SIGReg objective to push JEPA toward this ideal, resulting in practical LeJEPA

2. JEPA-T → JEPA-T: Joint-Embedding Predictive Architecture with Text Fusion for Image Generation (2510.00974)
A text-to-image model that tokenizes images and captions with a joint predictive Transformer, enhances fusion with cross-attention and text embeddings before training loss, and generates images by iteratively denoising visual tokens conditioned on text

3. Text-JEPA → Speaking in Words, Thinking in Logic: A Dual-Process Framework in QA Systems (2507.20491)
Converts natural language into first-order logic, with a Z3 solver handling reasoning, enabling efficient, explainable QA with far lower compute than large LLMs

4. N-JEPA (Noise-based JEPA) → Improving Joint Embedding Predictive Architecture with Diffusion Noise (2507.15216)
Connects self-supervised learning with diffusion-style noise by using noise-based masking and multi-level schedules, especially improving visual classification

5. SparseJEPA → SparseJEPA: Sparse Representation Learning of Joint Embedding Predictive Architectures (2504.16140)
Adds sparse representation learning to make embeddings more interpretable and efficient. It groups latent variables by shared semantic structure using a sparsity penalty while preserving accuracy

6. TS-JEPA (Time Series JEPA) → Joint Embeddings Go Temporal (2509.25449)
Adapts JEPA to time-series by learning latent self-supervised representations and predicting future latents for robustness to noise and confounders

Read further below ↓
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 30 days ago

7+ Main precision formats used in AI:

Precision is very important in AI as it shapes how accurate and efficient models are. It controls how finely numbers are represented, approximating real-world values with formats like fixed-point and floating-point. A recent BF16 → FP16 study renewed attention on the impact of precision.
Here are the main precision types used in AI, from full precision for training to ultra-low precision for inference:

1. FP32 (Float32):
Standard full-precision float used in most training: 1 sign bit, 8 exponent bits, 23 mantissa bits. Default for backward-compatible training and baseline numerical stability

2. FP16 (Float16) → https://arxiv.org/abs/2305.10947v6
Half-precision float. It balances accuracy and efficiency. 1 sign bit, 5 exponent bits, 10 mantissa bits. Common on NVIDIA Tensor Cores and mixed-precision setups. There’s now a new wave of using it in reinforcement learning: https://www.turingpost.com/p/fp16

3. BF16 (BFloat16) → https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus
Same dynamic range as FP32 but fewer mantissa bits: 1 sign bit, 8 exponent bits (same as FP32), 7 mantissa bits. It was developed by the research group Google Brain as part of their AI/ML infrastructure work at Google. Preferred on TPUs and modern GPUs

4. FP8 (E4M3 / E5M2) → https://proceedings.neurips.cc/paper_files/paper/2018/file/335d3d1cd7ef05ec77714a215134914c-Paper.pdf
Emerging standard for training and inference on NVIDIA Hopper (H100) and Blackwell (B200) tensor cores and AMD MI300. Also supported in NVIDIA’s Transformer Engine: https://developer.nvidia.com/blog/floating-point-8-an-introduction-to-efficient-lower-precision-ai-training/
E4M3 = 4 exponent, 3 mantissa bits
E5M2 = 5 exponent, 2 mantissa bits
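
The bit layouts listed above determine each format's range and resolution. The sketch below derives the largest finite value and the spacing just above 1.0 from the exponent and mantissa widths. It assumes IEEE-style encodings for FP32, FP16, BF16 and E5M2, and assumes the common "E4M3FN" convention for E4M3 (no infinities, only the all-ones pattern is NaN); the helper names are illustrative, not from any library.

```python
import struct

def describe(exp_bits, man_bits, reserves_top_exponent=True):
    """Return the max finite value and epsilon implied by a float bit layout."""
    bias = 2 ** (exp_bits - 1) - 1
    if reserves_top_exponent:
        # IEEE-style: the all-ones exponent encodes inf/NaN.
        max_exp = (2 ** exp_bits - 2) - bias
        max_frac = 2 - 2.0 ** -man_bits
    else:
        # Assumed E4M3FN-style: only exponent=all-ones AND mantissa=all-ones is NaN.
        max_exp = (2 ** exp_bits - 1) - bias
        max_frac = 2 - 2 * 2.0 ** -man_bits
    return {"max": max_frac * 2.0 ** max_exp, "epsilon": 2.0 ** -man_bits}

FORMATS = {
    "FP32": describe(8, 23),
    "FP16": describe(5, 10),
    "BF16": describe(8, 7),
    "FP8 E5M2": describe(5, 2),
    "FP8 E4M3": describe(4, 3, reserves_top_exponent=False),
}

def fp32_to_bf16_truncate(x):
    """Keep only the top 16 bits of an FP32 value (round toward zero).

    Real hardware typically rounds to nearest even; truncation keeps the sketch short.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

for name, info in FORMATS.items():
    print(f"{name:9s} max={info['max']:.6g} epsilon={info['epsilon']:.3g}")
```

The output makes the trade-off visible: BF16 reaches roughly the same maximum as FP32 but with a much coarser epsilon, while FP16 has finer resolution and a maximum of only 65504.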

Read further below ⬇️
If you like this, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update about 1 month ago

11 Fascinating new Policy Optimization techniques

Policy optimization (PO) algorithms are central to training AI models with preference-based feedback. In recent weeks, numerous new PO methods have emerged that build on or replace the popular PPO and GRPO, solving their issues. Here are 11 of them:

1. BAlanced Policy Optimization (BAPO) → BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping (2510.18927)
Dynamically adjusts the clipping bounds in PPO-style updates to balance positive and negative gradients and prevent entropy collapse

2. Training-Free GRPO → Training-Free Group Relative Policy Optimization (2510.08191)
Instead of using numeric rewards, it compares rollouts semantically to distill useful knowledge as a token prior, which is then applied during inference to guide the model’s behavior

3. Asymmetric Importance Sampling Policy Optimization (ASPO) → ASPO: Asymmetric Importance Sampling Policy Optimization (2510.06062)
Fixes imbalanced token weighting in LLM training. It flips the importance sampling ratios for positive tokens to correct over- and under-updates, and adds a soft dual-clipping step to keep gradients stable

4. In-Context Steered Policy Optimization (ICPO) → https://arxiv.org/abs/2510.26519
Uses a model’s own in-context learning ability to guide training with existing data. It combines Mixed-Policy GRPO with Implicit Expert Forcing to expand exploration and adds Expert Region Reject Sampling and Annealed Expert-Bonus Reward Shaping to ensure stability and balanced expert influence

5. Graph-Enhanced Policy Optimization (GEPO) → https://arxiv.org/abs/2510.26270
Builds a graph of an agent’s experiences to understand how different states connect, guide exploration and assign rewards more effectively

6. Information Gain-based Policy Optimization (IGPO) → Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents (2510.14967)
Uses the model’s own belief updates to create dense, informative feedback for smoother multi-turn learning
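
Several methods in this list modify the clipping step of PPO-style updates. As background, here is a minimal sketch of the standard per-token clipped surrogate with separate lower and upper bounds. It is the generic PPO objective, not the adaptive rule of BAPO, ASPO or any other paper above; `clip_low` and `clip_high` are illustrative parameter names.

```python
def clipped_surrogate(ratio, advantage, clip_low=0.2, clip_high=0.2):
    """Per-token PPO-style objective (to be maximized).

    ratio     : pi_new(token) / pi_old(token), the importance sampling ratio
    advantage : estimated advantage for that token
    """
    clipped_ratio = min(max(ratio, 1.0 - clip_low), 1.0 + clip_high)
    # Taking the minimum makes the objective pessimistic: once the ratio moves
    # past the bound in the direction the advantage favors, the gradient vanishes.
    return min(ratio * advantage, clipped_ratio * advantage)

# Positive advantage: the gain is capped once the ratio exceeds 1 + clip_high.
capped_gain = clipped_surrogate(1.5, 1.0)
# Negative advantage: the penalty is floored once the ratio drops below 1 - clip_low.
floored_penalty = clipped_surrogate(0.5, -1.0)
# A wider upper bound lets positive-advantage tokens move further.
wider_gain = clipped_surrogate(1.5, 1.0, clip_high=0.4)
```

Adaptive-clipping methods can be read as choosing `clip_low` and `clip_high` per batch or per token instead of fixing them, which changes how much positive and negative samples contribute to the update.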

Read further below ⬇️
If you like this, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update about 1 month ago

12 Awesome GitHub repos to upgrade your AI coding

Coding is the field where AI is welcomed with open arms. Here’s a collection to help you take your AI-assisted coding workflows to the next level of convenience and efficiency:

1. Smol Developer → https://github.com/smol-ai/developer
A lightweight AI “junior dev” that takes your product spec and automatically scaffolds or helps you build full codebases

2. Tabby → https://github.com/TabbyML/tabby
A self-hosted AI coding assistant that runs locally as an alternative to GitHub Copilot. Easy to integrate, GPU-friendly, and doesn’t rely on the cloud

3. Beads (bd) Issue Tracker → https://github.com/steveyegge/beads
Gives coding agents long-term memory, letting them organize, plan, and execute complex tasks reliably across sessions

4. MetaGPT → https://github.com/FoundationAgents/MetaGPT
A multi-agent framework that imitates a software company team using LLMs. It assigns AI agents roles like PM, Architect, and Developer to produce user stories, designs, specs, and final code

5. Open Interpreter → https://github.com/openinterpreter/open-interpreter
Gives you ChatGPT’s coding power with full local control – no limits, no sandbox – so you can automate, analyze, and create anything right from your desktop through a chat interface

6. OpenSpec → https://github.com/Fission-AI/OpenSpec
A lightweight, spec-driven development tool that helps humans and AI agree on what to build before any code is written

7. PR-Agent → https://github.com/qodo-ai/pr-agent
An AI code reviewer that automatically reviews, describes, and improves pull requests across GitHub, GitLab, and other platforms

8. BabyAGI → https://github.com/yoheinakajima/babyagi
A self-building AI framework that gives agents the ability to write, manage, and refine their own functions, turning them from passive tools into active, self-building systems

9 ...⬇️

Subscribe to the Turing Post: https://www.turingpost.com/subscribe – your shortcut to deep, clear AI analysis

Kseniase

posted an update about 2 months ago

5 Lectures and keynotes defining AI right now

If you want to understand the multifaceted AI landscape in 2025 and see where the field is heading – start with (or revisit) these legendary talks. They can help you capture what’s happening in AI from multiple angles:

1. Andrej Karpathy: Software Is Changing (Again) → https://www.youtube.com/watch?v=LCEmiRjPEtQ
Unveils Software 3.0 – a paradigm where LLMs are the new computers, programmed with prompts instead of code. The key: developers must now master coding, training, and prompting as AI becomes the heart of software building

2. Richard Sutton, The OaK Architecture: A Vision of SuperIntelligence from Experience → https://www.youtube.com/watch?v=gEbbGyNkR2U
Unveils the OaK (Options and Knowledge) architecture – a model-based RL framework for continual intelligence, where every component learns, meta-learns & builds hierarchical abstractions

3. GTC March 2025 Keynote with NVIDIA CEO Jensen Huang → https://www.youtube.com/watch?v=_waPvOwL9Z8
Dives into accelerated computing and the importance of Physical AI. From the Blackwell GPU architecture & AI factories to breakthroughs in agentic AI & robotics, Jensen Huang explains how NVIDIA aims to power every layer of the AI ecosystem

4. Yann LeCun "Mathematical Obstacles on the Way to Human-Level AI" → https://www.youtube.com/watch?v=ETZfkkv6V7
Yann LeCun has long argued that we need a new path to machines that reason about the world – not LLMs or RL. So this lecture is about self-supervised systems with world models, planning, memory and energy-based learning

5. Andrew Ng: State of AI Agents → https://www.youtube.com/watch?v=4pYzYmSdSH4
Highlights one of the most pressing topics of 2025 – agents, explaining why most effective AI agents rely on simple, linear workflows built from modular “Lego-brick” tasks + what predicts AI startup success in the new agent era

Subscribe to the Turing Post: https://www.turingpost.com/subscribe – your shortcut to deep, clear AI analysis

Kseniase

posted an update about 2 months ago

9 Powerful AI Video Generation Tools

Since Sora 2 has been on fire in recent weeks, reminding us what high-quality video generation should look like, we decided you really need this list of video generation tools – great alternatives or complements to it.

1. Sora 2 → https://openai.com/sora/
It needs no introduction, but OpenAI’s text-to-video model produces short, ultra-realistic clips across styles (cinematic, photorealistic, animated, etc.) with synced audio

2. Google Veo 3 (Gemini Video Generation) → https://aistudio.google.com/models/veo-3
Part of Gemini AI. Generates 8-second high-fidelity videos from text or images with native sound: background soundtracks and realistic voices with near-perfect lip sync

3. Runway (Gen-4 by Runway ML) → https://runwayml.com/
Text, image, or video-to-video generation with advanced editing like changing lighting, weather, camera angles or replacing objects. Popular in AI filmmaking

4. Pika Labs → https://pollo.ai/m/pika-ai
Provides creative, often stylized short videos – from cinematic mini-scenes to cartoon-like animations. Ideal for social media clips and visual storytelling. Plus, you can add playful effects to manipulate objects in the generated videos

5. Luma’s Dream Machine → https://lumalabs.ai/dream-machine
Powered by Luma AI’s latest Ray 3 model, it quickly visualizes story ideas, animated concept art, or abstract motion videos. It supports consistent custom characters and seamless looping

Read further below ⬇️
If you like it, also subscribe to the Turing Post https://www.turingpost.com/subscribe

Kseniase

posted an update 2 months ago

8 Emerging trends in Reinforcement Learning

Reinforcement learning is having a moment - and not just this week. Some of its directions are already showing huge promise, while others are still early but exciting. Here’s a look at what’s happening right now in RL:

1. Reinforcement Pre-Training (RPT) → Reinforcement Pre-Training (2506.08007)
Reframes next-token pretraining as RL with verifiable rewards, yielding scalable reasoning gains

2. Reinforcement Learning from Human Feedback (RLHF) → Deep reinforcement learning from human preferences (1706.03741)
The top approach. It trains a model using human preference feedback, building a reward model and then optimizing the policy to generate outputs people prefer

3. Reinforcement Learning with Verifiable Rewards (RLVR) → Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs (2506.14245)
Moves from subjective (human-labeled) rewards to objective ones that can be automatically verified, like in math, code, or rubrics as reward, for example → Reinforcement Learning with Rubric Anchors (2508.12790), Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains (2507.17746)

4. Multi-objective RL → Pareto Multi-Objective Alignment for Language Models (2508.07768)
Trains LMs to balance multiple goals at once, like being helpful but also concise or creative, ensuring that improving one goal doesn’t ruin another

5. Parallel thinking RL → Parallel-R1: Towards Parallel Thinking via Reinforcement Learning (2509.07980)
Trains parallel chains of thought, boosting math accuracy and final ceilings. It first teaches the model “parallel thinking” skill on easier problems, then uses RL to refine it on harder ones
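
The reward-model step described under RLHF is commonly trained with a pairwise (Bradley-Terry) loss on human preference pairs. The sketch below shows that loss for scalar rewards; it is a generic illustration under that assumption, not the exact objective of any paper in this list, and the function name is hypothetical.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Written as softplus(-margin) in a numerically stable form.
    """
    margin = reward_chosen - reward_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# The loss is small when the reward model ranks the preferred answer higher,
# and large when it ranks the pair the wrong way round.
agrees = preference_loss(3.0, 1.0)
disagrees = preference_loss(1.0, 3.0)
undecided = preference_loss(1.0, 1.0)  # equals log(2)
```

RLVR replaces this learned reward with a programmatic check, for example whether a math answer matches or a unit test passes, so no preference pairs are needed for those domains.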

Read further below ⬇️
And if you like this, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Also, check out our recent guide about the past, present and future of RL: https://www.turingpost.com/p/rlguide

Kseniase

posted an update 2 months ago

12 Excellent MCP Servers

The family of MCP (Model Context Protocol) servers keeps expanding to bridge agents, models, tools, web, data and apps. Here are 12 useful MCP servers that will help you create convenient agentic ecosystems:

1. Chrome DevTools MCP → https://github.com/ChromeDevTools/chrome-devtools-mcp
Lets your coding agent (Gemini, Claude, Cursor, Copilot) control a live Chrome browser with full DevTools access for automation, debugging, and performance analysis

2. Windows-MCP → https://github.com/CursorTouch/Windows-MCP
Provides interaction between agents and Windows, handling file navigation, app control, UI actions, QA testing

3. MCPControl → https://github.com/claude-did-this/MCPControl
Windows control server for programmatic control of mouse, keyboard, window management, and screen capture

4. MetaMCP → https://github.com/metatool-ai/metamcp
A proxy that aggregates multiple MCP servers into one, with middleware support. Works as a standard MCP server for any client

5. MindsDB → https://github.com/mindsdb/mindsdb
Humans, models, agents and apps get accurate answers from large-scale data sources

6. Playwright MCP → https://github.com/microsoft/playwright-mcp
Lets LLMs interact with web pages via structured accessibility snapshots, no need for screenshots or visually-tuned models

7. MCP Access Point → https://github.com/sxhxliang/mcp-access-point
Bridges MCP clients with HTTP services, no server-side changes needed

8. Browserbase MCP Server → https://github.com/browserbase/mcp-server-browserbase
Connects LLMs to external data and tools, adding cloud browser automation via Browserbase and Stagehand. It enables LLMs to browse, capture, extract, and act on web pages with precision

9. Yutu → https://github.com/eat-pray-ai/yutu
Automates YouTube workflows, managing videos, playlists, channels, comments, captions, etc.

3 more below ↓
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 3 months ago

10 awesome advanced LoRA approaches

Low-Rank Adaptation (LoRA) is the go-to method for efficient model fine-tuning that adds small low-rank matrices instead of retraining full models. The field isn’t standing still – new LoRA variants push the limits of efficiency, generalization, and personalization. So we’re sharing 10 of the latest LoRA approaches you should know about:

1. Mixture-of-LoRA-experts → Mixture of Low-Rank Adapter Experts in Generalizable Audio Deepfake Detection (2509.13878)
Adds multiple low-rank adapters (LoRA) into a model’s layers, and a routing mechanism activates the most suitable ones for each input. This lets the model adapt better to new unseen conditions

2. Amortized Bayesian Meta-Learning for LoRA (ABMLL) → Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models (2508.14285)
Balances global and task-specific parameters within a Bayesian framework to improve uncertainty calibration and generalization to new tasks without high memory or compute costs

3. AutoLoRA → AutoLoRA: Automatic LoRA Retrieval and Fine-Grained Gated Fusion for Text-to-Image Generation (2508.02107)
Automatically retrieves and dynamically aggregates public LoRAs for stronger T2I generation

4. aLoRA (Activated LoRA) → Activated LoRA: Fine-tuned LLMs for Intrinsics (2504.12397)
Only applies LoRA after invocation, letting the model reuse the base model’s KV cache instead of recomputing the full turn’s KV cache. Efficient in multi-turn conversations

5. LiLoRA (LoRA in LoRA) → LoRA in LoRA: Towards Parameter-Efficient Architecture Expansion for Continual Visual Instruction Tuning (2508.06202)
Shares the LoRA matrix A across tasks and additionally low-rank-decomposes matrix B to cut parameters in continual vision-text MLLMs

6. Sensitivity-LoRA → Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language Models (2509.09119)
Dynamically assigns ranks to weight matrices based on their sensitivity, measured using second-order derivatives
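
All of these variants build on the same base mechanism: a frozen weight matrix plus a trainable low-rank update. Here is a minimal plain-Python sketch of that base LoRA layer, assuming the usual convention of a small random `A`, a zero-initialized `B`, and an `alpha / rank` scale; the class and attribute names are illustrative and not taken from any library.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen, shape (d_out, d_in)
        self.a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.b = [[0.0] * rank for _ in range(d_out)]  # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.b, matvec(self.a, x))  # B @ (A @ x)
        return [h + self.scale * u for h, u in zip(base, update)]

    def trainable_params(self):
        return sum(len(row) for row in self.a) + sum(len(row) for row in self.b)

layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
before = layer.forward([1.0, 1.0])  # identical to the frozen layer at initialization

# Pretend training moved the adapters to these values.
layer.a = [[1.0, 1.0]]
layer.b = [[1.0], [0.0]]
after = layer.forward([1.0, 1.0])
```

Only `A` and `B` are trained, which is `rank * (d_in + d_out)` parameters instead of `d_in * d_out`. The variants above change how many such adapters exist, how they are routed or shared, and how the rank is chosen per matrix.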

Read further below ↓
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 3 months ago

6 Recent & free sources to master Reinforcement Learning

Almost every week new research and resources on RL come out. Knowledge needs to be constantly refreshed and updated with the latest trends. So today, we’re sharing 6 free sources to help you stay on track with RL:

1. A Survey of Continual Reinforcement Learning → https://arxiv.org/abs/2506.21872
Covers continual RL (CRL): how agents can keep learning and adapt to new tasks without forgetting past ones. It analyses methods, benchmarks, evaluation metrics & challenges

2. The Deep Reinforcement Learning course by Hugging Face → https://huggingface.co/learn/deep-rl-course/unit0/introduction
This is a popular free course, regularly updated. Includes community interaction, exercises, leaderboards, etc.

3. Reinforcement Learning Specialization (Coursera, University of Alberta) → https://www.coursera.org/specializations/reinforcement-learning
A 4-course series introducing foundational RL, implementing different algorithms, culminating in a capstone. It's a great structured path

4. A Technical Survey of Reinforcement Learning Techniques for LLMs → A Technical Survey of Reinforcement Learning Techniques for Large Language Models (2507.04136)
Looks at how RL is being used for/with LLMs for alignment, reasoning, preference signals, etc. Covers methods like RLHF, RLAIF, DPO, PPO, GRPO & applications from code gen to tool use

5. A Survey of Reinforcement Learning for Software Engineering → https://arxiv.org/abs/2507.12483
Good if you're interested in RL-applied domains. Examines how RL is used in software engineering tasks: maintenance, development, evaluation. Covering 115 papers since DRL introduction, it summarizes trends, gaps & challenges

6. A Survey of Reinforcement Learning for LRMs → https://arxiv.org/abs/2509.08827
Tracks the way from LLMs to LRMs via RL. Covers reward design, policy optimization, use cases and future approaches like continual, memory, model-based RL and more

If you liked this, subscribe to The Turing Post https://www.turingpost.com/subscribe

Kseniase

posted an update 3 months ago

10 Latest Preference Optimization Techniques

Models need feedback on what makes outputs “good” or “bad.” Policy optimization (PO) turns preferences and rewards into actual training signals. This field is evolving quickly, moving far beyond classics like PPO and GRPO. So here is our overview of 10 newest PO methods:

1. Pref-GRPO → Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning (2508.20751)
Stabilizes text-to-image reinforcement learning (RL) with pairwise preference rewards and a unified UNIGENBENCH benchmark

2. PVPO (Pre-Estimated Value-Based Policy Optimization) → PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning (2508.21104)
This critic-free RL method uses a pre-trained model as a reference anchor to reduce bias and guide learning, selecting high-value examples through data pre-sampling

3. DCPO (Dynamic Clipping Policy Optimization) → DCPO: Dynamic Clipping Policy Optimization (2509.02333)
Uses dynamic clipping, which adjusts probability limits per token for better token exploration, and smooth reward standardization to balance rewards over training steps and prevent wasted updates

4. ARPO (Agentic Reinforced Policy Optimization) → Agentic Reinforced Policy Optimization (2507.19849)
Optimizes multi-turn LLM agents that use external tools. It uses an entropy-based adaptive rollout to explore post-tool use and an advantage attribution method to better assign credit across steps, leading to more efficient tool use with fewer resources

5. GRPO-RoC (Group Relative Policy Optimization with Resampling-on-Correct) → rStar2-Agent: Agentic Reasoning Technical Report (2508.20722)
Oversamples rollouts, then resamples them to keep diverse mistakes and only the highest-quality correct answers. It reduces noise and ends up with stronger reasoning in a code environment
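
Several of these methods start from GRPO, which scores each rollout relative to the other rollouts sampled for the same prompt instead of using a learned critic. A minimal sketch of that group-relative advantage follows. It assumes population standard deviation and a small `eps` for stability; implementations differ on both, and this is not the specific resampling scheme of GRPO-RoC.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of rollouts."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four rollouts for one prompt, with a binary correctness reward.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# If every rollout gets the same reward, the group carries no learning signal.
uniform = group_relative_advantages([1.0, 1.0, 1.0, 1.0])
```

Correct rollouts in the mixed group get an advantage close to +1 and incorrect ones close to -1, while the uniform group yields all zeros. That second case is why methods in this space filter, resample or reweight rollouts to avoid wasted updates.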

Read further below ⬇️
If you like this, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 3 months ago

11 Powerful Image Models

Everyone is buzzing about image generation this week, or more specifically, Google's Nano-Banana. So today we want to share a list of models that can make a great toolkit for image generation + editing + multi-turn refinement.

1. Gemini 2.5 Flash Image, or Nano-Banana → https://deepmind.google/models/gemini/image/
Google’s newest image model with conversational editing, character consistency, and multi-image fusion. Available in AI Studio and the Gemini API. Price: $2.50 per 1M tokens

2. FLUX (Black Forest Labs) → https://bfl.ai/
A family of models known for rich detail, excellent prompt adherence, and fast iterative generation. Offered in several variants, from Pro to open-source, it's accessible via Hugging Face, Replicate, Azure AI Foundry, etc., and used as a base in many pipelines. Price: $0.025-0.08 per image

3. Midjourney v7 → https://www.midjourney.com/
Enhanced image fidelity, prompt comprehension, and anatomical coherence (hands, bodies, objects) + provides a smart lightbox editor. The Omni-reference tool improves character and object consistency in your images. It remains accessible via Discord with a supporting web interface. Price: $10-60/month

4. Stable Diffusion 3.5 (Stability AI) → https://stability.ai/stable-image
Open-weights line with improved text rendering, photorealism, and prompt adherence compared to earlier versions. It introduces technical innovations through its MMDiT architecture. Price: $0.025-0.065 per image

5. OpenAI GPT-Image-1 → https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1
It's the same multimodal model that powers ChatGPT's image capabilities, offering high-fidelity image generation, precise edits, including inpainting, and accurate text rendering. Available via the Images API. Price: $40 per 1M tokens

Read further below ⬇️
If you like this, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 4 months ago

6 Must-read books about AI and Machine Learning:

Sharing some free, useful resources for you. In this collection, we’ve gathered the most recent books to give you up-to-date information on key fundamental topics. Hope this helps you master AI and machine learning:

1. Machine Learning Systems by Vijay Janapa Reddi → https://www.mlsysbook.ai/
Provides a framework for building effective ML solutions, covering data engineering, optimization, hardware-aware training, inference acceleration, architecture choice, and other key principles

2. Generative Diffusion Modeling: A Practical Handbook by Zihan Ding, Chi Jin → https://arxiv.org/abs/2412.17162
Offers a unified view of diffusion models: probabilistic, score-based, consistency, rectified flow, pre/post-training. It aligns notations with code to close the “paper-to-code” gap.

3. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges → https://arxiv.org/abs/2104.13478
Explores unified geometric principles to analyze neural networks' architectures: CNNs, RNNs, GNNs, Transformers, and guide the design of the future ones

4. Mathematical Foundations of Geometric Deep Learning by Haitz Saez de Ocariz Borde and Michael Bronstein → https://arxiv.org/abs/2508.02723
Dives into the key math concepts behind geometric deep learning: geometric and analytical structures, vector calculus, differential geometry, etc.

5. Interpretable Machine Learning by Christoph Molnar → https://github.com/christophM/interpretable-ml-book
Practical guide to simple, transparent models (e.g., decision trees) and model-agnostic methods like LIME, Shapley values, permutation importance, and accumulated local effects.

6. Understanding Deep Learning by Simon J.D. Prince → https://udlbook.github.io/udlbook/
Explores core deep learning concepts: models, training, evaluation, RL, architectures for images, text, and graphs, addressing open theoretical questions

Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 4 months ago

Post

3489

12 Powerful World Models

World models are one of the most challenging areas in AI, pushing the boundaries of reasoning, perception, and planning. They're gen AI systems that help models and agents learn internal representations of real-world environments.

Today, we invite you to take a look at 12 standout examples:

1. WorldVLA → WorldVLA: Towards Autoregressive Action World Model (2506.21539)
This autoregressive world model integrates action prediction and visual world modeling in a single framework, allowing each to enhance the other. It introduces an attention masking strategy to reduce action prediction errors

2. SimuRA → https://arxiv.org/abs/2507.23773
A generalized agent architecture that uses a language-based world model to simulate and plan actions before execution, enabling more general and flexible reasoning

3. PAN (Physical, Agentic, and Nested) world models → Critiques of World Models (2507.05169)
Has a hybrid architecture that combines discrete concept-based reasoning (via LLMs) with continuous perceptual simulation (via diffusion models), enabling rich multi-level, multimodal understanding and prediction

4. MineWorld by Microsoft Research → MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft (2504.08388)
Enables real-time, interactive world modeling in Minecraft by combining visual and action tokenization within an autoregressive Transformer. It uses parallel decoding for fast scene generation (4–7 FPS)

5. WorldMem → WORLDMEM: Long-term Consistent World Simulation with Memory (2504.12369)
Uses a memory bank with attention over time-stamped frames and states to maintain long-term and 3D spatial consistency in scene generation. This lets it reconstruct past scenes and simulate dynamic world changes across large temporal gaps

Read further below ⬇️

If you like this, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Plus explore this article for a comprehensive overview of the history and current evolution of world models: https://www.turingpost.com/p/topic-35-what-are-world-models

Kseniase

posted an update 4 months ago

Post

5103

9 new policy optimization techniques

Reinforcement Learning (RL) isn't stuck in the same old PPO loop - in the last two months alone, researchers have introduced a new wave of techniques, reshaping how we train and fine-tune LLMs, VLMs, and agents.

Here are 9 fresh policy optimization techniques worth knowing:

1. GSPO: Group Sequence Policy Optimization → Group Sequence Policy Optimization (2507.18071)
Shifts from token-level to sequence-level optimization, clipping, and rewarding to capture the full picture and increase stability compared to GRPO. The GSPO-token variant also allows token-level fine-tuning.

2. LAPO: Length-Adaptive Policy Optimization → LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization (2507.15758)
A two-stage RL framework that trains models to adaptively control reasoning length by learning typical solution lengths for shorter and more efficient reasoning.

3. HBPO: Hierarchical Budget Policy Optimization → Hierarchical Budget Policy Optimization for Adaptive Reasoning (2507.15844)
This one trains models to adapt reasoning depth based on problem complexity. It divides training samples into subgroups with different token budgets, using budget-aware rewards to align reasoning effort with task difficulty.

4. SOPHIA: Semi-off-policy reinforcement learning → Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning (2507.16814)
Combines on-policy visual understanding from the Vision Language Models (VLMs) with off-policy reasoning from an LM, assigning outcome-based rewards and propagating visual rewards backward through the reasoning steps.

5. RePO: Replay-Enhanced Policy Optimization → RePO: Replay-Enhanced Policy Optimization (2506.09340)
Introduces a replay buffer into on-policy RL for LLMs, retrieving diverse off-policy samples for each prompt to broaden the training data per prompt
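
To make the sequence-level idea behind GSPO (item 1 above) more concrete, here is a minimal sketch in plain Python. It rests on assumptions taken from our reading of the GSPO paper rather than from anything stated in this post: the importance ratio is length-normalized (the geometric mean of the per-token ratios), and advantages are normalized within each group of sampled responses. The function names and the eps value are hypothetical placeholders, not the authors' code or settings.

```python
import math

def group_advantages(rewards):
    """Normalize rewards within one prompt's group of sampled responses."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    return [(r - mean) / (math.sqrt(var) + 1e-8) for r in rewards]

def sequence_ratio(new_logps, old_logps):
    """One importance ratio per response (assumed length-normalized form).

    exp(mean of token log-ratios) equals the geometric mean of token ratios.
    """
    diffs = [n - o for n, o in zip(new_logps, old_logps)]
    return math.exp(sum(diffs) / len(diffs))

def clipped_sequence_objective(new_logps, old_logps, rewards, eps=0.2):
    """Clipped surrogate averaged over the group.

    Clipping is applied once per sequence, not once per token.
    eps is an illustrative value only.
    """
    advantages = group_advantages(rewards)
    total = 0.0
    for n, o, adv in zip(new_logps, old_logps, advantages):
        ratio = sequence_ratio(n, o)
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(rewards)

# Toy group: two sampled responses with per-token log-probabilities
# under the new and the old policy, plus one scalar reward each.
new_logps = [[-1.0, -2.0], [-0.5, -0.5, -0.5]]
old_logps = [[-1.2, -2.2], [-0.4, -0.4, -0.4]]
rewards = [1.0, 0.0]
objective = clipped_sequence_objective(new_logps, old_logps, rewards)
```

In a token-level scheme, each token would get its own ratio and its own clip. Here a whole response is either inside or outside the clipping range, which is the "full picture" view described above.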

Read further below ⬇️
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 5 months ago

Post

6268

6 Essential Reads on core AI/ML topics:

Time to look at some free useful resources that can help you upgrade your knowledge of AI and machine learning!
Today we offer you these 6 must-read surveys that can be your perfect guides to the major fields and techniques:

1. Foundations of Large Language Models by Tong Xiao and Jingbo Zhu → https://arxiv.org/abs/2501.09223
Many recommend this 270-page book as a good resource to focus on fundamental concepts, such as pre-training, generative models, prompting, alignment, and inference

2. Large Language Models Post-Training: Surveying Techniques from Alignment to Reasoning → A Survey on Post-training of Large Language Models (2503.06072)
Read this to master policy optimization (RLHF, DPO, GRPO), supervised and parameter-efficient fine-tuning, reasoning, integration, and adaptation techniques

3. Agentic Large Language Models, a survey by Leiden University → https://arxiv.org/abs/2503.23037
Surveys agentic LLMs across reasoning, tools, and multi-agent collaboration, highlighting their synergy. It also explores their promise, risks and applications in medicine, finance, science.

4. A Survey of Context Engineering for Large Language Models → A Survey of Context Engineering for Large Language Models (2507.13334)
Defines Context Engineering as systematic info design for LLMs beyond prompting, covering retrieval, processing, management, and architectures like RAG and multi-agent systems

5. A Survey of Generative Categories and Techniques in Multimodal Large Language Models → https://arxiv.org/abs/2506.10016
Covers multimodal models, exploring six generative modalities, key techniques (SSL, RLHF, CoT), architectural trends, and challenges

6. Large Language Models for Time Series Analysis: Techniques, Applications, and Challenges → https://arxiv.org/abs/2506.11040
Explains how LLMs transform time series analysis by enhancing pattern recognition and long-term dependency handling + shows how to build them

Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Kseniase

posted an update 5 months ago

Post

5213

13 New types of LoRA

LoRA (Low-Rank Adaptation) is a popular lightweight method for fine-tuning AI models. Instead of updating the full model, it adds small trainable components (low-rank matrices) while keeping the original weights frozen. Only these adapters are trained.
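
For readers who want to see the basic mechanism, here is a minimal sketch in plain Python, assuming the standard LoRA formulation: the effective weight is W + (alpha / r) * B A, with B initialized to zeros so training starts from the original model. The helper names and the toy 4x4 sizes are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A; the frozen W is never modified."""
    r = len(a)                # rank = number of rows in A
    scale = alpha / r
    delta = matmul(b, a)      # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def count_params(m):
    """Number of scalar entries in a matrix."""
    return sum(len(row) for row in m)

# Toy example: a frozen 4x4 identity weight and a rank-1 adapter.
w = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
a = [[0.1, 0.2, 0.3, 0.4]]          # r x d_in, trainable
b = [[0.0], [0.0], [0.0], [0.0]]    # d_out x r, trainable, starts at zero

w_eff = lora_effective_weight(w, a, b, alpha=2.0)
trainable = count_params(a) + count_params(b)  # adapter entries only
frozen = count_params(w)                       # entries left untouched
```

Because B starts at zero, w_eff equals w before any training step. The saving grows with layer size: the adapter holds r * (d_in + d_out) values, while the frozen matrix holds d_in * d_out.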

Recently, many interesting new LoRA variations came out, so it’s a great time to take a look at these 13 clever approaches:

1. T-LoRA → T-LoRA: Single Image Diffusion Model Customization Without Overfitting (2507.05964)
A timestep-dependent LoRA method for adapting diffusion models with a single image. It dynamically adjusts updates and uses orthogonal initialization to reduce overlap, achieving better fidelity–alignment balance than standard LoRA

2. SingLoRA → SingLoRA: Low Rank Adaptation Using a Single Matrix (2507.05566)
Simplifies LoRA by using only one small matrix instead of the usual two, and multiplying it by its own transpose (like A × Aᵀ). It uses half the parameters of LoRA and avoids scale mismatch between different matrices

3. LiON-LoRA → LiON-LoRA: Rethinking LoRA Fusion to Unify Controllable Spatial and Temporal Generation for Video Diffusion (2507.05678)
Improves control and precision in video diffusion models when training data is limited. It builds on LoRA, adding 3 key principles: linear scalability, orthogonality, and norm consistency. A controllable token and modified self-attention enable smooth adjustment of motion

4. LoRA-Mixer → LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing (2507.00029)
Combines LoRA and mixture-of-experts (MoE) to adapt LLMs for multiple tasks. It dynamically routes task-specific LoRA experts into linear projections of attention modules, supporting both joint training and frozen expert reuse

5. QR-LoRA → QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation (2507.04599)
Separates content and style when combining multiple LoRA adapters. It implements QR decomposition to structure parameter updates, where the orthogonal Q matrix reduces interference between features, and the R matrix captures specific transformations
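
The single-matrix idea from SingLoRA (item 2 above) can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the weight is square, the update is exactly A × Aᵀ as described above, and any scaling factor or training schedule the paper may use is left out. All names and sizes are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def singlora_effective_weight(w, a):
    """Return W + A @ A^T for a square d x d weight and a d x r matrix A."""
    delta = matmul(a, transpose(a))   # d x d and symmetric by construction
    return [[w_ij + d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Toy example: d = 3, r = 1, with a zero base weight so the update is visible.
d, r = 3, 1
w = [[0.0] * d for _ in range(d)]
a = [[1.0], [2.0], [0.0]]             # the only trainable matrix

w_eff = singlora_effective_weight(w, a)

singlora_params = d * r               # one d x r matrix
lora_params = 2 * d * r               # two matrices, for the same d and r
```

With a single matrix there is no second factor whose scale can drift relative to the first, which is the "scale mismatch" point made above. The trade-off in this simplified form is that the update is always symmetric.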

Read further in the comments 👇

If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Turing Post

AI & ML interests

Articles

FOD#93: When AI meant Ambient Intelligence

🎙️🧩 TP/Inference: Sharon Zhou on AI Hallucinations, Agents Hype, and Giving Developers the Keys to GenAI

🌁#86: Four Freedoms of truly open AI

Team members 3

TuringPost's activity