Model Card for shavera/starcoder2-15b-w4-autoawq-gemm

This is an int4 AWQ-quantized checkpoint of bigcode/starcoder2-15b. It requires about 10 GB of VRAM.

Running this Model

Run via Docker with text-generation-inference:

docker run --gpus all --shm-size 64g -p 8080:80 -v ~/.cache/huggingface:/data \
    ghcr.io/huggingface/text-generation-inference:3.1.0 \
    --model-id shavera/starcoder2-15b-w4-autoawq-gemm
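
Once the container is up, it serves text-generation-inference's standard REST API on localhost:8080. A minimal smoke test with curl (the prompt and max_new_tokens value below are just illustrative):

# send a code-completion request to the TGI /generate endpoint
curl 127.0.0.1:8080/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "def fibonacci(n):", "parameters": {"max_new_tokens": 64}}'

The response is a JSON object whose generated_text field contains the completion.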
