Is 4 x H20 96G sufficient to run this model?

by milongwong - opened

We have limited resources and have the following questions:

  1. Is 4 x H20 96G sufficient to run this model?
  2. Has anyone tried running it with SGLang for better performance?

The quantized weights alone are 346 GB, which is still very large.

4 x H20 96G can load it, but the usable context length will be very short, since the weights leave almost no memory for the KV cache.
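
As a rough sanity check, the headroom left for the KV cache can be estimated as below. All model hyperparameters in this sketch are illustrative assumptions, not values from the actual model config:

```python
# Rough VRAM headroom estimate for 4 x H20 96G.
# Model hyperparameters below are placeholder assumptions,
# not taken from the actual model config.

GIB = 1024**3

total_vram = 4 * 96 * GIB   # 384 GiB across 4 GPUs
weights = 346 * GIB         # quantized weights, per the post above
overhead = 8 * GIB          # assumed activation / framework overhead
headroom = total_vram - weights - overhead

# Per-token KV cache size: 2 (K and V) * layers * kv_heads * head_dim * bytes.
# Substitute the real config values for a meaningful number.
num_layers, num_kv_heads, head_dim, dtype_bytes = 61, 8, 128, 2
kv_per_token = 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

print(f"headroom: {headroom / GIB:.1f} GiB")
print(f"approx. max cached tokens: {headroom // kv_per_token:,}")
```

With these assumed numbers only about 30 GiB remains for the KV cache, which is why the practical context length ends up short even if the weights technically fit.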

Red Hat AI org
  1. I don't think the model would fit on that configuration, especially since, on top of the weights, you need extra memory for a reasonably large context size.
  2. We create our models specifically for vLLM. We are not aware of their compatibility with SGLang.
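
For reference, a minimal vLLM launch sketch with 4-way tensor parallelism; the model ID and context length are placeholder assumptions, not the actual checkpoint:

```python
# Minimal vLLM sketch for 4 GPUs; model ID and settings are
# placeholder assumptions, substitute the real checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(
    model="RedHatAI/<quantized-model-id>",  # placeholder, not a real ID
    tensor_parallel_size=4,                 # shard weights across 4 GPUs
    max_model_len=4096,                     # keep small so the KV cache fits
    gpu_memory_utilization=0.95,            # use most of each 96G card
)

out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```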
ekurtic changed discussion status to closed
