Where can I find the v4 paper?
#1
by StephennFernandes - opened
Hi, I am a huge fan of your work and have been actively following all of the research findings that the OWSM series brings to the open-source community.
Could you please tell me where I can find a link to the OWSM v4 paper?
Specifically:
@inproceedings{owsm-v4,
title={{OWSM} v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning},
author={Yifan Peng and Shakeel Muhammad and Yui Sudo and William Chen and Jinchuan Tian and Chyi-Jiunn Lin and Shinji Watanabe},
booktitle={Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH) (accepted)},
year={2025},
}
Even a preprint would help, in case one is available anywhere.
Hi @StephennFernandes, thank you so much for your interest in our series of work! I have uploaded the paper to arXiv, and I expect it to be released in one or two days.
StephennFernandes changed discussion status to closed