Duplicated from microsoft/Phi-4-multimodal-instruct
mjtechguy/phi-4-multimodal-instruct
Automatic Speech Recognition
Transformers
Safetensors
24 languages
phi4mm
text-generation
nlp
code
audio
speech-summarization
speech-translation
visual-question-answering
phi-4-multimodal
phi
phi-4-mini
custom_code
arxiv: 2407.13833
License: mit
main / phi-4-multimodal-instruct / figures
744 kB
2 contributors
History: 1 commit
mjtechguy: Duplicate from microsoft/Phi-4-multimodal-instruct (df2cb0d, verified, 12 months ago)
File | Size | Last commit | Updated
audio_understand.png | 42.6 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
multi_image.png | 192 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
speech_qa.png | 46.8 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
speech_recog_by_lang.png | 90.7 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
speech_recognition.png | 63.5 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
speech_summarization.png | 41 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
speech_translate.png | 47.7 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
speech_translate_2.png | 46.3 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago
vision_radar.png | 174 kB | Duplicate from microsoft/Phi-4-multimodal-instruct | 12 months ago