dl-Llama-3.1-8B-Instruct-MLX-4bit

This is an MLX-converted build of Meta's Llama 3.1 8B Instruct model, quantized to 4-bit for efficient inference on Apple Silicon (e.g., M3).

  • Converted with: mlx-lm v0.22.3 (see the conversion sketch below)
  • Memory usage: ~4.2 GB
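
The conversion can be reproduced with mlx-lm's Python API. This is a minimal sketch, not the exact command used for this repo: it assumes the upstream weights live at meta-llama/Llama-3.1-8B-Instruct on the Hub (a gated repo requiring license acceptance) and that your mlx-lm version's convert function accepts the quantization keywords shown; check the signature for your installed version.

```python
# Sketch: re-create a 4-bit MLX conversion with mlx-lm.
# Assumes mlx-lm is installed (pip install mlx-lm) and access to the
# gated upstream Meta repo has been granted.
from mlx_lm import convert

convert(
    hf_path="meta-llama/Llama-3.1-8B-Instruct",    # assumed upstream repo id
    mlx_path="dl-Llama-3.1-8B-Instruct-MLX-4bit",  # local output directory
    quantize=True,    # enable weight quantization
    q_bits=4,         # 4-bit weights
    q_group_size=64,  # assumed group size (mlx-lm's default)
)
```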
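To run the model, mlx-lm's standard load/generate API applies. A minimal sketch, using the repo id from this page and the chat-template handling that mlx-lm's generated model cards suggest:

```python
# Sketch: load and query the 4-bit model with mlx-lm.
from mlx_lm import load, generate

model, tokenizer = load("darrenkwlai/Llama-3.1-8B-Instruct-dl-MLX-4bit")

prompt = "Explain 4-bit quantization in one sentence."

# Llama 3.1 Instruct expects its chat template; apply it when present.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```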
