Pretrained ARC-Encoders and a fine-tuning dataset: context compression for unmodified LLMs.
Kyutai
non-profit
Verified
https://kyutai.org/next/stt
MoshiVis is a vision-speech model, built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs.
MLX, Candle & PyTorch model checkpoints published as part of Kyutai's Moshi release. Run inference via: https://github.com/kyutai-labs/moshi
https://kyutai.org/next/tts
Helium 1: a modular and multilingual LLM
Hibiki is a model for streaming speech translation, and it can run on-device! See https://github.com/kyutai-labs/hibiki.