ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding Paper • 2603.27064 • Published 13 days ago • 24
view article Article Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents 10 days ago • 33
Granite Vision Models Collection Multimodal models built for visual document analysis and image understanding. • 6 items • Updated 8 days ago • 33
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Paper • 2410.06154 • Published Oct 8, 2024 • 16
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Paper • 2406.12034 • Published Jun 17, 2024 • 16
$\textit{Trans-LoRA}$: towards data-free Transferable Parameter Efficient Finetuning Paper • 2405.17258 • Published May 27, 2024 • 16
LangNav: Language as a Perceptual Representation for Navigation Paper • 2310.07889 • Published Oct 11, 2023 • 6