# FireRedVAD-C: FRVD weights for the pure-C inference engine
Pre-converted weights for running [FireRedTeam/FireRedVAD](https://github.com/FireRedTeam/FireRedVAD) on the zero-dependency C inference engine used by `mod_fireredvad` (FreeSWITCH module) and `fireredvad-dart` (Flutter package).
The upstream PyTorch checkpoints ship as `model.pth.tar` files and require torch + kaldi at inference time. This repo strips them down to a single flat float32 blob plus a JSON CMVN file, suitable for embedding in C, Dart, or any runtime that just wants `fread()` + matmul.
## Files
| File | Size | Description |
|---|---|---|
| `fireredvad.bin` | 4.41 MB | FRVD weights (VAD + AED), LE float32 |
| `fireredvad.json` | 3.2 KB | CMVN stats (means, inv_std), 80 bins |
| `export_frvd.py` | – | Reproducible export script (PyTorch → FRVD) |
## Source models
- VAD: `FireRedTeam/FireRedVAD/Stream-VAD`, a streaming-trained DFSMN; no lookahead is used at inference (causal).
- AED: `FireRedTeam/FireRedVAD/AED`, a non-streaming DFSMN with lookahead, 3-class (speech / music / noise).
- CMVN: Kaldi `cmvn.ark` from the same upstream repo, converted to JSON.
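Applying the CMVN stats is a per-bin affine normalization. The sketch below assumes `fireredvad.json` holds top-level `"means"` and `"inv_std"` arrays of 80 floats each (key names inferred from the file table above; check the actual JSON schema):

```python
import json

def apply_cmvn(frame, cmvn_path):
    """Normalize one log-mel frame: (x - mean) * inv_std, per mel bin.

    Assumes the JSON schema {"means": [...], "inv_std": [...]}, which is a
    guess based on this repo's file description, not a confirmed format.
    """
    with open(cmvn_path) as f:
        stats = json.load(f)
    return [(x - m) * s for x, m, s in zip(frame, stats["means"], stats["inv_std"])]
```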
## Architecture
DFSMN with shared topology for VAD and AED:
| | VAD (Stream-VAD) | AED |
|---|---|---|
| Input dim (mel bins) | 80 | 80 |
| Hidden | 256 | 256 |
| Projection | 128 | 128 |
| FSMN blocks (R) | 8 | 8 |
| Lookback order (N1) | 20 | 20 |
| Lookahead order (N2) | 20 (skipped at inference) | 20 |
| Output classes | 1 (sigmoid) | 3 (softmax) |
| Parameters | 567,937 | 588,931 |
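The causal memory update in an FSMN block can be sketched as follows. This is a simplified reading of the table above (projection dim P = 128, lookback order N1 = 20, stride 1, no lookahead), not a guaranteed match for the engine's exact formulation:

```python
def fsmn_lookback(proj, filt):
    """Causal FSMN memory: m[t] = p[t] + sum_i filt[:, i] * p[t-1-i], elementwise.

    proj: list of T frames, each a list of P projection values.
    filt: P x N1 depthwise filter, matching the [P, K] Conv1d storage noted
    in the format section. Stride 1 and zero left-padding are assumed here.
    """
    P = len(filt)
    N1 = len(filt[0])
    out = []
    for t, p in enumerate(proj):
        m = list(p)  # identity path: the projection itself
        for i in range(N1):
            if t - 1 - i < 0:
                break  # before the start of the stream: treat as zeros
            past = proj[t - 1 - i]
            for d in range(P):
                m[d] += filt[d][i] * past[d]
        out.append(m)
    return out
```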
## FRVD binary format

```
offset  size       field
0       4 bytes    magic = "FRVD"
4       uint32     version = 1 (little-endian)
8       float32[]  VAD weights (see fireredvad.h::VadWeights)
...     float32[]  AED weights (see fireredvad.h::AedWeights)
```
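The 8-byte header can be validated in a few lines; this is a sketch in Python (the field widths are as listed above):

```python
import struct

def read_frvd_header(data: bytes) -> int:
    """Check the FRVD magic/version and return the float32 payload count."""
    magic, version = struct.unpack_from("<4sI", data, 0)
    if magic != b"FRVD":
        raise ValueError("not an FRVD file")
    if version != 1:
        raise ValueError(f"unsupported FRVD version {version}")
    n_floats, rem = divmod(len(data) - 8, 4)
    if rem:
        raise ValueError("payload is not a whole number of float32s")
    return n_floats
```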
VAD layout (in read order):

- `inp_fc1_w[80*256]`, `inp_fc1_b[256]`
- `inp_fc2_w[256*128]`, `inp_fc2_b[128]`
- `fsmn0_lookback[128*20]`
- 7 × `{fc1_w[128*256], fc1_b[256], fc2_w[256*128], lookback[128*20]}`
- `out_fc1_w[128*256]`, `out_fc1_b[256]`
- `out_fc2_w[256*1]`, `out_fc2_b[1]`
AED layout adds a lookahead filter at every FSMN site and uses a 3-class output:

- `inp_fc1_w[80*256]`, `inp_fc1_b[256]`
- `inp_fc2_w[256*128]`, `inp_fc2_b[128]`
- `fsmn0_lookback[128*20]`, `fsmn0_lookahead[128*20]`
- 7 × `{fc1_w, fc1_b, fc2_w, lookback, lookahead}`
- `out_fc1_w[128*256]`, `out_fc1_b[256]`
- `out_fc2_w[256*3]`, `out_fc2_b[3]`
Linear weights are stored row-major as `[in, out]` (PyTorch's `Linear.weight` transposed). Depthwise Conv1d filters are stored as `[P, K]`.
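The layouts above fully determine the tensor sizes, so the parameter totals in the architecture table and the file size can be sanity-checked with plain arithmetic (no assumptions beyond the listed shapes):

```python
def vad_params() -> int:
    n = 80 * 256 + 256 + 256 * 128 + 128          # input stack, both with bias
    n += 128 * 20                                  # fsmn0 lookback filter
    n += 7 * (128 * 256 + 256 + 256 * 128 + 128 * 20)  # 7 blocks: fc1+bias, fc2 (no bias), lookback
    n += 128 * 256 + 256 + 256 * 1 + 1             # output stack, 1-class head
    return n

def aed_params() -> int:
    # AED adds a lookahead filter at all 8 FSMN sites and swaps in a 3-class head
    return vad_params() + 8 * (128 * 20) + (256 * 3 + 3) - (256 * 1 + 1)

assert vad_params() == 567_937
assert aed_params() == 588_931
# 8-byte header + float32 payload = 4,627,480 bytes ≈ 4.41 MiB, as listed above
assert 8 + 4 * (vad_params() + aed_params()) == 4_627_480
```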
## Usage
### Download
```python
from huggingface_hub import hf_hub_download

bin_path = hf_hub_download(
    repo_id="eschmidbauer/fireredvad-c", filename="fireredvad.bin"
)
json_path = hf_hub_download(
    repo_id="eschmidbauer/fireredvad-c", filename="fireredvad.json"
)
```
Or with the CLI:
```shell
huggingface-cli download eschmidbauer/fireredvad-c --local-dir models/
```
### C (FreeSWITCH module)

`mod_fireredvad` loads the files directly:

```c
Cmvn cmvn;
VadWeights vad;
AedWeights aed;

fireredvad_load_cmvn("fireredvad.json", &cmvn);
fireredvad_load_weights("fireredvad.bin", &vad, &aed);
```
### Dart (Flutter)

`fireredvad-dart` bundles the same files as Flutter assets and parses them in pure Dart.
## Reproducing

The `export_frvd.py` script downloads the upstream PyTorch checkpoints and writes byte-identical `fireredvad.bin` + `fireredvad.json`:

```shell
uv run export_frvd.py
```
Dependencies (handled automatically by uv from the inline PEP 723 metadata): `torch`, `numpy`, `kaldiio`, `huggingface_hub`, `fireredvad`.
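For reference, a PEP 723 inline-metadata header for such a script would look roughly like this (illustrative only; see `export_frvd.py` for the actual block and any version pins):

```python
# /// script
# dependencies = [
#     "torch",
#     "numpy",
#     "kaldiio",
#     "huggingface_hub",
#     "fireredvad",
# ]
# ///
```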
## License
Apache 2.0, inherited from the upstream FireRedVAD release. The original model authors retain credit for training; this repo only provides a repackaged binary form.
## Citation
```bibtex
@misc{fireredvad,
  title        = {FireRedVAD: A SOTA Industrial-Grade Voice Activity
                  Detection \& Audio Event Detection},
  author       = {Xu, Kaituo and Li, Wenpeng and Huang, Kai and Liu, Kun},
  year         = {2026},
  howpublished = {\url{https://github.com/FireRedTeam/FireRedVAD}},
}
```