Tim Dingman

tdingman

https://timdingman.com/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 7 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 9 days ago • 197

upvoted a paper 24 days ago

ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents

Paper • 2511.07685 • Published about 1 month ago • 9

upvoted a paper 25 days ago

The Path Not Taken: RLVR Provably Learns Off the Principals

Paper • 2511.08567 • Published 30 days ago • 32

upvoted 5 papers about 1 month ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14 • 115

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16 • 104

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28 • 97

upvoted 5 papers 2 months ago

Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 117

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18 • 114

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18 • 111

LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22 • 102

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2 • 124

upvoted 7 papers 4 months ago

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Paper • 2507.23726 • Published Jul 31 • 114

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6 • 129

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 180

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 238

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 264

Reinforcement Learning with Rubric Anchors

Paper • 2508.12790 • Published Aug 18 • 13

Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation

Paper • 2508.06426 • Published Aug 8 • 10