John Pope

johndpope

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

liu-hanghang/jam-1-ckpts

liked a model 1 day ago

declare-lab/JAM-0.5

new activity 3 days ago

unsloth/Qwen-Image-Layered-GGUF:Diffusers run gguf?

upvoted a paper 22 days ago

PersonaLive! Expressive Portrait Image Animation for Live Streaming

Paper • 2512.11253 • Published 26 days ago • 34

upvoted 4 papers about 1 month ago

Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 29

RELIC: Interactive Video World Model with Long-Horizon Memory

Paper • 2512.04040 • Published Dec 3, 2025 • 23

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 224

STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow

Paper • 2511.20462 • Published Nov 25, 2025 • 31

upvoted 4 papers about 2 months ago

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published Nov 12, 2025 • 69

PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image

Paper • 2511.13648 • Published Nov 17, 2025 • 52

Back to Basics: Let Denoising Generative Models Denoise

Paper • 2511.13720 • Published Nov 17, 2025 • 67

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published Nov 13, 2025 • 96

upvoted a collection about 2 months ago

Vision Language Models: 2025 Update

This collection includes all the models, datasets and Spaces mentioned in the blog Vision Language Models: 2025 Update • 67 items • Updated May 12, 2025 • 6

upvoted 7 papers 2 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 501

MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency

Paper • 2510.25897 • Published Oct 29, 2025 • 16

TradingAgents: Multi-Agents LLM Financial Trading Framework

Paper • 2412.20138 • Published Dec 28, 2024 • 15

OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

Paper • 2410.17799 • Published Oct 23, 2024 • 7

Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

Paper • 2510.19808 • Published Oct 22, 2025 • 29

RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer

Paper • 2508.05115 • Published Aug 7, 2025 • 3

Wan-S2V: Audio-Driven Cinematic Video Generation

Paper • 2508.18621 • Published Aug 26, 2025 • 20

upvoted 3 papers 3 months ago

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 24

Dawn of the transformer era in speech emotion recognition: closing the valence gap

Paper • 2203.07378 • Published Mar 14, 2022 • 2

Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Paper • 2509.19296 • Published Sep 23, 2025 • 23