QuarkAudio Technical Report
Paper • 2512.20151
Make sure the ckpt_path in conf/config_adaptive_v3.yaml is valid.
git clone https://github.com/alibaba/unified-audio.git
cd unified-audio/QuarkAudio-HCodec
conda create -n unise python=3.10
conda activate unise
pip install -r requirements.txt
#!/bin/bash
python audio_tokenizer.py
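Before running the script, it can help to confirm that the checkpoint path is valid. The following is a minimal sketch, not part of the repository: it assumes the config contains a line of the form `ckpt_path: <path>` (at any nesting level) and that the path contains no `#` character. The helper names `find_ckpt_path` and `check_ckpt_path` are hypothetical.

```python
import os
import re

# Matches a line such as `ckpt_path: /path/to/model.ckpt  # comment`.
# Assumption: the key is literally `ckpt_path` and the value has no '#'.
_CKPT_LINE = re.compile(r"^\s*ckpt_path\s*:\s*(?P<value>[^#]*)")


def find_ckpt_path(config_text):
    """Return the first ckpt_path value found in the YAML text, or None."""
    for line in config_text.splitlines():
        match = _CKPT_LINE.match(line)
        if match:
            # Drop surrounding whitespace and optional quotes.
            value = match.group("value").strip().strip("'\"")
            return value or None
    return None


def check_ckpt_path(config_file="conf/config_adaptive_v3.yaml"):
    """Return (path, is_valid) for the checkpoint named in the config file."""
    with open(config_file, encoding="utf-8") as handle:
        path = find_ckpt_path(handle.read())
    # A missing key or a non-existent path both count as invalid.
    return path, bool(path) and os.path.exists(path)
```

Calling `check_ckpt_path()` from the QuarkAudio-HCodec directory returns the configured path together with a flag saying whether it exists on disk.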
# hyperparameter configuration in conf/config_adaptive_v3.yaml
training: false # keep false when testing
use_similarity_alignment: true
use_dynamic_similarity_threshold: false
infer_using_dynamic_threshold: true # takes effect only when manual_threshold is null
similarity_threshold: 0.7
similarity_threshold_lower: 0.7
similarity_threshold_upper: 1.0 # lower/upper bound the valid interval of the dynamic threshold when infer_using_dynamic_threshold is enabled
max_tokens_per_group: 8
manual_threshold: 0.6 # set to a fixed value to evaluate a specific threshold
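The comments in the configuration imply an order of precedence among the threshold settings. The sketch below illustrates one plausible reading and is not the repository's implementation. It assumes `manual_threshold` wins whenever it is set, that a dynamic value is otherwise clamped to the lower/upper interval, and that `similarity_threshold` serves as the fallback. The names `ThresholdConfig`, `resolve_threshold`, and `dynamic_estimate` are hypothetical, and how the dynamic estimate is computed is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdConfig:
    # Defaults mirror the values listed in conf/config_adaptive_v3.yaml.
    infer_using_dynamic_threshold: bool = True
    similarity_threshold: float = 0.7
    similarity_threshold_lower: float = 0.7
    similarity_threshold_upper: float = 1.0
    manual_threshold: Optional[float] = 0.6


def resolve_threshold(cfg, dynamic_estimate=None):
    """Pick the similarity threshold to use at inference time (assumed logic)."""
    # 1. A manual threshold overrides everything else.
    if cfg.manual_threshold is not None:
        return cfg.manual_threshold
    # 2. Otherwise clamp the dynamic estimate to the valid interval.
    if cfg.infer_using_dynamic_threshold and dynamic_estimate is not None:
        lower = cfg.similarity_threshold_lower
        upper = cfg.similarity_threshold_upper
        return min(max(dynamic_estimate, lower), upper)
    # 3. Assumed fallback: the fixed similarity threshold.
    return cfg.similarity_threshold
```

With the values shown above, `manual_threshold: 0.6` would be used as is. Setting it to null would let the dynamic threshold apply within [0.7, 1.0].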
We would like to thank the following projects for their great work: