# dl-Llama-3.1-8B-Instruct-MLX-4bit
This is an MLX-converted version of Meta's Llama 3.1-8B Instruct model, quantized to 4 bits for efficient inference on Apple Silicon (e.g., an M3 Mac).
- Converted with: `mlx-lm` v0.22.3
- Memory usage: ~4.2 GB
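
For reference, a conversion along these lines can be reproduced with mlx-lm's converter. The sketch below assumes the `mlx_lm.convert` Python API and uses an output directory name chosen for illustration; exact argument names may vary between mlx-lm releases, so check against your installed version.

```python
# Sketch: reproduce a 4-bit MLX conversion of the source model.
# Assumes the mlx_lm.convert API of recent mlx-lm releases.
from mlx_lm import convert

convert(
    hf_path="meta-llama/Llama-3.1-8B-Instruct",    # source Hugging Face repo
    mlx_path="dl-Llama-3.1-8B-Instruct-MLX-4bit",  # illustrative output directory
    quantize=True,                                 # quantize the weights
    q_bits=4,                                      # 4-bit quantization
)
```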
## Model tree for darrenkwlai/Llama-3.1-8B-Instruct-dl-MLX-4bit

- Base model: meta-llama/Llama-3.1-8B
- Finetuned from: meta-llama/Llama-3.1-8B-Instruct
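
The converted model can be loaded and run with mlx-lm's standard `load`/`generate` API. The snippet below follows the usual mlx-lm model-card pattern; the prompt text is illustrative.

```python
# Sketch: load the 4-bit MLX model and generate a response with mlx-lm.
from mlx_lm import load, generate

model, tokenizer = load("darrenkwlai/Llama-3.1-8B-Instruct-dl-MLX-4bit")

prompt = "Explain what MLX is in one sentence."  # illustrative prompt

# Llama 3.1 Instruct expects chat-formatted input, so apply the chat
# template when one is available.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```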