Visual Grounding Adapter

LoRA adapter fine-tuned on Qwen2-VL-7B-Instruct for visual grounding tasks.

Usage

from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import PeftModel
import torch

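# Load the base model in half precision (float16 suits GPUs without bfloat16 support, such as the T4)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    device_map="auto",
    torch_dtype=torch.float16,
)

# Attach the fine-tuned adapter weights on top of the base model
model = PeftModel.from_pretrained(model, "YOUR_USERNAME/visual-grounding-adapter")
# The processor (tokenizer and image preprocessing) comes from the base model
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")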
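
Once the model has generated text, the grounding result still has to be pulled out of the output string. The sketch below is one way to do that, assuming the adapter keeps the base Qwen2-VL convention of writing boxes as <|box_start|>(x1,y1),(x2,y2)<|box_end|> with coordinates normalized to a 0-1000 range. This card does not confirm the adapter's output format, so treat the pattern and the scale as assumptions; parse_boxes is a hypothetical helper, not part of any library.

```python
import re

# Assumption: boxes follow the base Qwen2-VL convention
# <|box_start|>(x1,y1),(x2,y2)<|box_end|>, normalized to 0-1000.
BOX_PATTERN = re.compile(
    r"<\|box_start\|>\((\d+),(\d+)\),\((\d+),(\d+)\)<\|box_end\|>"
)


def parse_boxes(text, image_width, image_height, scale=1000):
    """Return a list of (x1, y1, x2, y2) pixel boxes found in the generated text."""
    boxes = []
    for match in BOX_PATTERN.finditer(text):
        x1, y1, x2, y2 = (int(value) for value in match.groups())
        # Rescale from the normalized grid to pixel coordinates.
        boxes.append((
            x1 * image_width / scale,
            y1 * image_height / scale,
            x2 * image_width / scale,
            y2 * image_height / scale,
        ))
    return boxes


# Example with a hand-written output string (not real model output).
sample = "<|box_start|>(100,200),(500,800)<|box_end|>"
print(parse_boxes(sample, image_width=640, image_height=480))
```

If the box markers are missing from the decoded text, it may help to decode with skip_special_tokens=False, since the tokenizer may treat them as special tokens.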

Training

Dataset: Custom diagrams with bounding boxes
LoRA rank: 8-16
Epochs: 2-3
Hardware: Google Colab T4
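
To give a feel for what the rank range above means in practice, the sketch below counts the trainable parameters LoRA adds at rank 8 and rank 16. It rests on several assumptions that this card does not state: that LoRA was applied only to the q_proj and v_proj attention projections of the language model, and that the language model has 28 layers with hidden size 3584 and a 512-wide value projection. For a linear layer with d_in inputs and d_out outputs, LoRA adds r * (d_in + d_out) parameters.

```python
# Assumed language-model dimensions (not stated in this card).
NUM_LAYERS = 28
HIDDEN_SIZE = 3584
KV_SIZE = 512

# Assumed LoRA targets: (input width, output width) per projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, KV_SIZE),
}


def lora_param_count(rank):
    """Trainable parameters added by LoRA across all assumed target layers."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS


for rank in (8, 16):
    print(f"rank {rank}: {lora_param_count(rank):,} trainable parameters")
```

Under these assumptions the count doubles between rank 8 and rank 16, yet stays far below one percent of a 7B-parameter base model either way.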

Model tree for ishanmakkar/visual-grounding-adapter

Base model: Qwen/Qwen2-VL-7B
Finetuned: Qwen/Qwen2-VL-7B-Instruct
Adapter: this model