Does the model support multi-token prediction? If so, how do you configure it in inference engines like vLLM or llama.cpp?
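For context, recent vLLM releases expose MTP-style drafting through the speculative-decoding path. A minimal sketch, assuming a vLLM build whose `LLM()` accepts a `speculative_config` dict; the `"deepseek_mtp"` method name and the model id are assumptions (borrowed from DeepSeek-style MTP support) and may not apply to this model, so check the docs for your installed version:

```python
# Hedged sketch: enabling MTP-style speculative decoding in vLLM.
# Assumes a vLLM version that accepts `speculative_config`; the
# "deepseek_mtp" method name is an assumption and may differ here.
from vllm import LLM, SamplingParams

llm = LLM(
    model="org/model-id",                # placeholder: this model's hub id
    speculative_config={
        "method": "deepseek_mtp",        # assumption: MTP drafting method
        "num_speculative_tokens": 1,     # draft one extra token per step
    },
)

params = SamplingParams(temperature=0.7, max_tokens=128)
out = llm.generate(["Hello, world"], params)
print(out[0].outputs[0].text)
```

llama.cpp does not expose an equivalent switch as far as I know, so whether the MTP head is usable there at all would need confirmation from the maintainers.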