Instructions to use moondream/moondream3-preview with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use moondream/moondream3-preview with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="moondream/moondream3-preview", trust_remote_code=True)# Load model directly from transformers import AutoModelForCausalLM model = AutoModelForCausalLM.from_pretrained("moondream/moondream3-preview", trust_remote_code=True, dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use moondream/moondream3-preview with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "moondream/moondream3-preview" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "moondream/moondream3-preview", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/moondream/moondream3-preview
- SGLang
How to use moondream/moondream3-preview with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "moondream/moondream3-preview" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "moondream/moondream3-preview", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "moondream/moondream3-preview" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "moondream/moondream3-preview", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use moondream/moondream3-preview with Docker Model Runner:
docker model run hf.co/moondream/moondream3-preview
License Clarification
Hi! The terms on the license say:
You may use the Licensed Work and Derivatives for any purpose, including commercial use, and you may self-host them for your or your organization’s internal use. You may not provide the Licensed Work, Derivatives, or any service that exposes their functionality to third parties (including via API, website, application, model hub, or dataset/model redistribution) without a separate commercial agreement with the Licensor.
I am curious as to what this means for, as an example, synthetic datasets; if we were to i.e. make a dataset of synthetic caption data with MD3, can we redistribute this? Or, if we finetuned the model, would we be allowed to redistribute the model? (It currently says that you can't 'expose their functionality' to a 'model hub', which I take to mean you can't upload them to Huggingface).
I would appreciate some clarification here. Thanks!
we were to i.e. make a dataset of synthetic caption data with MD3, can we redistribute this?
Yes, if the assets are generated by you or your organization, (not live generated or manipulated by third-party)
if we finetuned the model, would we be allowed to redistribute the model?
absolute not, stated very clearly, can't upload/share this model or its derivatives to anywhere that would allow third-party access it or it's functionality
Rule of thumb: outputs are yours, models are restricted.
this is not legal advise
License has been updated to improve clarity and make it less restrictive. Talked to @Fizzarolli on Discord so I think we're good here, going to close but please reopen if anyone has further questions.