Börje Karlsson

tellarin

https://tellarin.com/borje/

AI & ML interests

Machine Learning Systems, Mobile Sensing, Knowledge Mining, Digital Entertainment

Recent Activity

upvoted 11 papers about 9 hours ago

SeeNav-Agent: Enhancing Vision-Language Navigation with Visual Prompt and Step-Level Policy Optimization

Paper • 2512.02631 • Published 9 days ago • 8

TV2TV: A Unified Framework for Interleaved Language and Video Generation

Paper • 2512.05103 • Published 7 days ago • 14

SIMA 2: A Generalist Embodied Agent for Virtual Worlds

Paper • 2512.04797 • Published 7 days ago • 19

ProPhy: Progressive Physical Alignment for Dynamic World Simulation

Paper • 2512.05564 • Published 6 days ago • 4

COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

Paper • 2512.04563 • Published 7 days ago • 13

Embodied Referring Expression Comprehension in Human-Robot Interaction

Paper • 2512.06558 • Published 5 days ago • 2

VideoVLA: Video Generators Can Be Generalizable Robot Manipulators

Paper • 2512.06963 • Published 4 days ago • 3

Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation

Paper • 2512.08186 • Published 2 days ago • 17

MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

Paper • 2512.06628 • Published 4 days ago • 12

OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Paper • 2512.07802 • Published 3 days ago • 36

Reflection Removal through Efficient Adaptation of Diffusion Transformers

Paper • 2512.05000 • Published 7 days ago • 14

upvoted a paper 2 days ago

WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Paper • 2504.15785 • Published Apr 22 • 22

upvoted 2 papers 13 days ago

What does it mean to understand language?

Paper • 2511.19757 • Published 17 days ago • 9

Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

Paper • 2511.19900 • Published 16 days ago • 46

upvoted 2 papers 15 days ago

PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image

Paper • 2511.13648 • Published 24 days ago • 52

GigaWorld-0: World Models as Data Engine to Empower Embodied AI

Paper • 2511.19861 • Published 16 days ago • 30

upvoted 3 papers 16 days ago

Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens

Paper • 2511.19418 • Published 17 days ago • 27

In-Video Instructions: Visual Signals as Generative Control

Paper • 2511.19401 • Published 17 days ago • 30

Computer-Use Agents as Judges for Generative User Interface

Paper • 2511.15567 • Published 22 days ago • 51

upvoted a paper about 1 month ago

π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Paper • 2510.25889 • Published Oct 29 • 64