Flash-VStream-Qwen-7b Model Card

Model details

We propose Flash-VStream, an efficient VLM with a novel Flash Memory mechanism that enables real-time understanding and question answering over extremely long video streams. The model achieves outstanding accuracy and efficiency on the EgoSchema, MLVU, LVBench, MVBench, and Video-MME benchmarks.

License

This project is licensed under the Apache 2.0 License.

Safetensors

Model size: 8B params
Tensor type: BF16

Paper for zhang9302002/Flash-VStream-Qwen-7b

Flash-VStream: Efficient Real-Time Understanding for Long Video Streams

Paper ID: 2506.23825, published Jun 30, 2025