diwank's Collections
Vision
Updated 11 days ago
apple/DepthPro
Depth Estimation
•
Updated
Feb 28
•
19.5k
•
488
rhymes-ai/Aria
Image-Text-to-Text
•
25B
•
Updated
Apr 23
•
43.2k
•
637
mit-han-lab/hart-0.7b-1024px
Unconditional Image Generation
•
Updated
Nov 17, 2024
•
13
deepseek-ai/Janus-1.3B
Any-to-Any
•
2B
•
Updated
Jan 27
•
7.45k
•
592
bingbangboom/flux-film-camera
Text-to-Image
•
Updated
Nov 15, 2024
•
27
•
28
neulab/PangeaInstruct
Updated
Feb 2
•
242
•
86
genmo/mochi-1-preview
Text-to-Video
•
Updated
Sep 4
•
3.6k
•
1.3k
stabilityai/stable-diffusion-3.5-large
Text-to-Image
•
Updated
Oct 22, 2024
•
50.6k
•
3.26k
Freepik/flux.1-lite-8B-alpha
Text-to-Image
•
Updated
Dec 30, 2024
•
445
•
427
microsoft/OmniParser
Image-Text-to-Text
•
Updated
Dec 2, 2024
•
497
•
1.7k
mistralai/Pixtral-12B-Base-2409
Updated
Jul 28
•
32
•
105
neulab/Pangea-7B
8B
•
Updated
Oct 24, 2024
•
6.28k
•
131
jadechoghari/Ferret-UI-Llama8b
Image-Text-to-Text
•
8B
•
Updated
Jan 8
•
279
•
68
OpenGVLab/InternVL2-1B
Image-Text-to-Text
•
0.9B
•
Updated
Mar 25
•
472k
•
77
OpenGVLab/InternVL2-2B
Image-Text-to-Text
•
2B
•
Updated
Mar 25
•
1.14M
•
76
OpenGVLab/Mono-InternVL-2B
Image-Text-to-Text
•
3B
•
Updated
Jul 22
•
8.68k
•
36
OpenGVLab/OmniCorpus-YT
Updated
Mar 20
•
746
•
13
OpenGVLab/OmniCorpus-CC-210M
Viewer
•
Updated
Mar 20
•
208M
•
252
•
32
OpenGVLab/OmniCorpus-CC
Viewer
•
Updated
Mar 20
•
872M
•
19.6k
•
22
OpenGVLab/InternVideo2_chat_8B_HD
Video-Text-to-Text
•
8B
•
Updated
Dec 18, 2024
•
106
•
18
OpenGVLab/ViCLIP
Updated
Jun 7, 2024
•
45
OpenGVLab/ASMv2
Text Generation
•
Updated
Feb 29, 2024
•
253
•
16
OpenGVLab/VideoChat2-IT
Viewer
•
Updated
Jun 29, 2024
•
1.82M
•
389
•
51
NimVideo/cogvideox-2b-img2vid
Image-to-Video
•
Updated
Oct 28, 2024
•
190
•
80
BAAI/Infinity-MM
Updated
Dec 13, 2024
•
4.96k
•
113
nvidia/RADIO-H
0.7B
•
Updated
Jul 4
•
103
•
10
Spawning/PD12M
Viewer
•
Updated
Jan 9
•
12.4M
•
2.55k
•
170
Shitao/OmniGen-v1
Text-to-Image
•
Updated
Nov 7, 2024
•
1.47k
•
321
InstantX/InstantIR
Image-to-Image
•
Updated
Nov 7, 2024
•
2
•
180
nvidia/Cosmos-0.1-Tokenizer-DI8x8
Updated
Dec 25, 2024
•
151
•
11
BAAI/Emu3-Chat
Text Generation
•
8B
•
Updated
Oct 24, 2024
•
644
•
73
briaai/RMBG-2.0
Image Segmentation
•
0.2B
•
Updated
20 days ago
•
287k
•
960
Watermark Anything with Localized Messages
Paper
•
2411.07231
•
Published
Nov 11, 2024
•
21
rain1011/pyramid-flow-miniflux
Text-to-Video
•
Updated
Nov 13, 2024
•
176
OpenGVLab/InternVL2-8B-MPO
Image-Text-to-Text
•
8B
•
Updated
Dec 20, 2024
•
112
•
37
mistralai/Pixtral-Large-Instruct-2411
Updated
Jul 28
•
149
•
426
briaai/BRIA-2.3
Text-to-Image
•
Updated
Apr 10
•
81
•
38
microsoft/Reducio-VAE
Updated
Nov 21, 2024
•
42
•
17
Lightricks/LTX-Video
Image-to-Video
•
Updated
Jul 16
•
282k
•
2.06k
apple/aimv2-3B-patch14-448
Image Feature Extraction
•
3B
•
Updated
Jul 8
•
112
•
13
THUdyh/Insight-V-Reason
Text Generation
•
8B
•
Updated
Nov 22, 2024
•
12
•
9
black-forest-labs/FLUX.1-Fill-dev
Updated
Jun 27
•
150k
•
964
Efficient-Large-Model/Sana_1600M_512px
Text-to-Image
•
Updated
Jan 10
•
386
•
39
Efficient-Large-Model/Sana_1600M_1024px
Text-to-Image
•
Updated
Oct 28
•
367
•
215
AIDC-AI/Ovis1.6-Gemma2-27B
Image-Text-to-Text
•
29B
•
Updated
Feb 26
•
129
•
62
HuggingFaceTB/SmolVLM-Base
Image-Text-to-Text
•
2B
•
Updated
Nov 28, 2024
•
4.18k
•
84
zai-org/glm-edge-v-5b
Image-Text-to-Text
•
5B
•
Updated
Jan 2
•
129
•
12
rhymes-ai/Aria-Base-64K
Image-Text-to-Text
•
25B
•
Updated
Dec 1, 2024
•
24
•
14
allenai/pixmo-point-explanations
Viewer
•
Updated
Dec 5, 2024
•
79.6k
•
161
•
9
tencent/HunyuanVideo
Text-to-Video
•
Updated
Mar 6
•
1.19k
•
2.08k
tencent/HunyuanVideo-PromptRewrite
Updated
Dec 6, 2024
•
220
•
52
google/paligemma2-28b-pt-896
Image-Text-to-Text
•
28B
•
Updated
Dec 5, 2024
•
236
•
50
OpenGVLab/InternVL2_5-78B
Image-Text-to-Text
•
78B
•
Updated
Sep 11
•
521
•
192
MAmmoTH-VL/MAmmoTH-VL-8B
8B
•
Updated
Dec 9, 2024
•
20
•
19
MAmmoTH-VL/MAmmoTH-VL-Instruct-12M
Viewer
•
Updated
Jan 5
•
37M
•
2.34k
•
63
OpenGVLab/PVC-InternVL2-8B
Image-Text-to-Text
•
10B
•
Updated
Dec 17, 2024
•
74
•
9
BGLab/BioTrove
Viewer
•
Updated
Dec 13, 2024
•
163M
•
1.16k
•
17
TencentARC/NVComposer
Image-to-3D
•
Updated
Dec 16, 2024
•
55
•
7
deepseek-ai/deepseek-vl2
Image-Text-to-Text
•
27B
•
Updated
Dec 18, 2024
•
3.9k
•
371
FastVideo/FastHunyuan
Text-to-Video
•
Updated
Jan 8
•
57
•
191
BAAI/nova-d48w1536-sdxl1024
Text-to-Image
•
Updated
Dec 21, 2024
•
14
•
7
IamCreateAI/Ruyi-Mini-7B
Image-to-Video
•
Updated
Dec 25, 2024
•
286
•
610
Infinigence/Megrez-3B-Omni
4B
•
Updated
Feb 14
•
25
•
135
microsoft/VidTok
Updated
Apr 5
•
42
TIGER-Lab/Mantis-8B-siglip-llama3
Image-to-Text
•
8B
•
Updated
Nov 15, 2024
•
492
•
33
OpenGVLab/HoVLE-HD
Image-Text-to-Text
•
3B
•
Updated
Feb 9
•
66
•
8
nyu-visionx/cambrian-34b
Text Generation
•
35B
•
Updated
Jun 28, 2024
•
23
•
27
nyu-visionx/cambrian-phi3-3b
Text Generation
•
4B
•
Updated
Jul 6, 2024
•
217
•
11
nyu-visionx/Cambrian-Alignment
Viewer
•
Updated
Jul 23, 2024
•
292k
•
8.11k
•
38
nvidia/Cosmos-1.0-Autoregressive-13B-Video2World
Updated
Feb 8
•
52
•
32
nvidia/Cosmos-1.0-Diffusion-14B-Video2World
Updated
May 7
•
1.72k
•
56
nvidia/Cosmos-1.0-Diffusion-14B-Text2World
Updated
May 7
•
1.88k
•
60
nvidia/Cosmos-1.0-Autoregressive-12B
Updated
Feb 11
•
43
•
30
StephanST/WALDO30
Object Detection
•
Updated
Jun 23
•
243
ByteDance/Sa2VA-8B
Image-Text-to-Text
•
8B
•
Updated
Sep 8
•
609
•
65
OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448
Video-Text-to-Text
•
2B
•
Updated
Mar 16
•
2.17k
•
26
OpenGVLab/VideoMAEv2-giant
Video Classification
•
1B
•
Updated
Feb 25
•
4.12k
•
4
MiniMaxAI/MiniMax-VL-01
Image-Text-to-Text
•
456B
•
Updated
Jul 3
•
91.5k
•
280
NimVideo/mochi-1-transformer-42
Text-to-Video
•
Updated
Jan 13
•
29
•
3
ostris/Flex.1-alpha
Text-to-Image
•
Updated
Jan 19
•
1.23k
•
481
tencent/Hunyuan3D-2
Image-to-3D
•
Updated
Oct 17
•
80.7k
•
1.68k
deepseek-ai/Janus-Pro-1B
Any-to-Any
•
Updated
Feb 1
•
8.02k
•
465
deepseek-ai/Janus-Pro-7B
Any-to-Any
•
Updated
Feb 1
•
59.8k
•
3.54k
Qwen/Qwen2.5-VL-72B-Instruct
Image-Text-to-Text
•
73B
•
Updated
Jun 6
•
117k
•
569
nvidia/Eagle2-9B
Image-Text-to-Text
•
9B
•
Updated
Jan 28
•
360
•
62
m-a-p/PIN-200M
Viewer
•
Updated
about 13 hours ago
•
68.1k
•
91.4k
•
20
AIDC-AI/Ovis2-34B
Image-Text-to-Text
•
35B
•
Updated
Aug 15
•
52.5k
•
151
microsoft/OmniParser-v2.0
Updated
Mar 28
•
889
•
1.31k
Alpha-VLLM/Lumina-Image-2.0
Text-to-Image
•
Updated
Mar 30
•
1.69k
•
348
prithivMLmods/JSONify-Flux
Image-Text-to-Text
•
2B
•
Updated
Feb 16
•
9
•
3
Skywork/SkyReels-V1-Hunyuan-I2V
Image-to-Video
•
Updated
Feb 24
•
597
•
274
Skywork/SkyReels-A1
Image-to-Video
•
Updated
Mar 4
•
37
•
64
AIDC-AI/Ovis2-16B
Image-Text-to-Text
•
16B
•
Updated
Aug 15
•
10.6k
•
101
curateIT/themet_openaccess_bestof
Viewer
•
Updated
Apr 7, 2024
•
1.77k
•
15
•
1
MnLgt/yolo-human-parse
Image Classification
•
Updated
Sep 19, 2024
•
27
•
11
google/paligemma2-3b-mix-448
Image-Text-to-Text
•
3B
•
Updated
Feb 7
•
5.62k
•
53
google/paligemma2-28b-mix-448
Image-Text-to-Text
•
28B
•
Updated
Feb 7
•
68
•
27
HuggingFaceTB/SmolVLM2-2.2B-Instruct
Image-Text-to-Text
•
2B
•
Updated
Apr 8
•
145k
•
287
Wan-AI/Wan2.1-T2V-14B
Text-to-Video
•
Updated
Mar 12
•
30.3k
•
1.43k
allenai/olmOCR-7B-0225-preview
Image-to-Text
•
8B
•
Updated
Aug 19
•
7.05k
•
703
microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition
•
6B
•
Updated
May 1
•
396k
•
1.55k
briaai/BRIA-4B-Adapt
Text-to-Image
•
Updated
Jun 11
•
225
•
8
DAMO-NLP-SG/VideoLLaMA3-7B
Video-Text-to-Text
•
8B
•
Updated
Sep 2
•
88.6k
•
71
ali-vilab/ACE_Plus
Updated
Mar 14
•
63
•
293
ByteDance/LatentSync-1.5
Updated
Jun 12
•
79.7k
•
83
IDEA-Research/RexSeek-3B
Image-Text-to-Text
•
4B
•
Updated
Mar 14
•
488
•
10
TIGER-Lab/Vamba-Qwen2-VL-7B
Video-Text-to-Text
•
11B
•
Updated
Mar 18
•
96
•
16
docling-project/SmolDocling-256M-preview
Image-Text-to-Text
•
0.3B
•
Updated
Sep 17
•
126k
•
1.6k
nvidia/Cosmos-Predict1-14B-Video2World
Updated
Apr 8
•
56
•
4
nvidia/Cosmos-Transfer1-7B
Updated
21 days ago
•
1.37k
•
58
CohereLabs/aya-vision-32b
Image-Text-to-Text
•
33B
•
Updated
Oct 30
•
140
•
217
ByteDance/Sa2VA-26B
Image-Text-to-Text
•
26B
•
Updated
Sep 8
•
61
•
31
ChaolongYang/KDTalker
Image-to-Video
•
Updated
Mar 30
•
13
Rapidata/OpenAI-4o_t2i_human_preference
Viewer
•
Updated
Mar 28
•
13k
•
509
•
34
McGill-NLP/AURORA
Image-to-Image
•
Updated
Dec 21, 2024
•
18
•
4
HiDream-ai/MotionPro
Image-to-Video
•
Updated
May 27
•
87
RaphaelLiu/Pusa-V0.5
Updated
Jul 23
•
84
•
46
OpenGVLab/InternVL3-38B
Image-Text-to-Text
•
38B
•
Updated
Sep 11
•
55.7k
•
43
ShoufaChen/PixelFlow-Text2Image
Text-to-Image
•
Updated
Apr 12
•
13
FoundationVision/Infinity
Updated
Jun 24
•
46
•
61
nvidia/PhysicalAI-SmartSpaces
Updated
Oct 19
•
6.42k
•
56
nvidia/DAM-3B-Video
Image-Text-to-Text
•
Updated
May 7
•
3.26k
•
57
nvidia/DAM-3B-Self-Contained
Image-Text-to-Text
•
Updated
May 7
•
717
•
24
OpenGVLab/VideoChat-R1_7B
Video-Text-to-Text
•
8B
•
Updated
Apr 22
•
487
•
8
Skywork/SkyCaptioner-V1
Video-Text-to-Text
•
8B
•
Updated
Apr 25
•
300
•
49
Fintor/Fintor-GUI-S2
Image-Text-to-Text
•
8B
•
Updated
Apr 24
•
19
•
4
ByteDance-Seed/UI-TARS-7B-DPO
Image-Text-to-Text
•
8B
•
Updated
Jan 25
•
1.29k
•
221
OpenGVLab/InternVL_2_5_HiCo_R64
Video-Text-to-Text
•
8B
•
Updated
May 13
•
83
•
3
ByteDance/Q-Insight
Updated
May 29
•
15
osunlp/UGround-V1-7B
Image-Text-to-Text
•
8B
•
Updated
Apr 16
•
642
•
19
echo840/MonkeyOCR
Image-Text-to-Text
•
Updated
Aug 28
•
703
•
512
showlab/show-o2-7B
Any-to-Any
•
Updated
Sep 5
•
129
•
15
ETH-CVG/lightglue_disk
Keypoint Detection
•
13.6M
•
Updated
Jul 17
•
9.99k
•
13
TencentARC/ARC-Hunyuan-Video-7B
Video-Text-to-Text
•
9B
•
Updated
Sep 19
•
558
•
30
Skywork/Matrix-3D
Image-to-3D
•
Updated
Sep 2
•
49
LiquidAI/LFM2-VL-1.6B
Image-Text-to-Text
•
2B
•
Updated
5 days ago
•
3.21k
•
212
nvidia/VideoITG-8B
Image-Text-to-Text
•
8B
•
Updated
Aug 13
•
276
•
7
allenai/olmOCR-7B-0725
Image-Text-to-Text
•
8B
•
Updated
Aug 26
•
1.16k
•
62
internlm/Intern-S1-mini
Image-Text-to-Text
•
9B
•
Updated
Oct 31
•
3.42k
•
102
AIDC-AI/Ovis2.5-9B
Image-Text-to-Text
•
9B
•
Updated
Oct 24
•
10.9k
•
298
openbmb/MiniCPM-V-4_5
Image-Text-to-Text
•
9B
•
Updated
Oct 10
•
49.5k
•
1.02k
apple/FastVLM-7B
Text Generation
•
8B
•
Updated
Sep 3
•
737
•
263
apple/MobileCLIP2-S3
Updated
Oct 9
•
53
•
4
apple/MobileCLIP2-S2
Updated
Oct 9
•
91
•
9
inclusionAI/UI-Venus-Ground-72B
Image-Text-to-Text
•
73B
•
Updated
Aug 19
•
706
•
11
PaddlePaddle/PP-OCRv5_mobile_det
Image-to-Text
•
Updated
Jul 22
•
86.2k
•
16
Hcompany/Holo1.5-72B
Image-Text-to-Text
•
73B
•
Updated
Sep 24
•
58
•
25
facebook/map-anything
Image-to-3D
•
0.6B
•
Updated
Sep 22
•
62.4k
•
50
YannQi/R-4B
Image-Text-to-Text
•
5B
•
Updated
Sep 4
•
53.9k
•
172
decart-ai/Lucy-Edit-Dev
Video-to-Video
•
Updated
20 days ago
•
465
•
311
TencentARC/ARC-Qwen-Video-7B-Narrator
Video-Text-to-Text
•
9B
•
Updated
Sep 21
•
47
•
7
manycore-research/SpatialLM1.1-Qwen-0.5B
Text Generation
•
0.6B
•
Updated
Sep 23
•
8.81k
•
25
PerceptronAI/Isaac-0.1
Text Generation
•
3B
•
Updated
Oct 9
•
4.22k
•
112
internlm/CapRL-3B
Image-Text-to-Text
•
4B
•
Updated
Oct 22
•
374
•
45
nvidia/Audio2Face-3D-v3.0
Updated
Oct 21
•
221
•
48
nvidia/nemotron-table-structure-v1
Object Detection
•
Updated
19 days ago
•
153
•
19
datalab-to/chandra
Image-to-Text
•
9B
•
Updated
Oct 21
•
89.3k
•
409
allenai/olmOCR-2-7B-1025
Image-to-Text
•
8B
•
Updated
Oct 22
•
33.5k
•
89
stepfun-ai/GELab-Zero-4B-preview
Image-to-Text
•
4B
•
Updated
9 days ago
•
796
•
92