whisper-large-turbo-finetune

This model is a fine-tuned version of adriszmar/whisper-large-v3-turbo-es on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0337
  • WER: 6.6030
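
Here WER is the word error rate, reported as a percentage (lower is better). Below is a minimal sketch of how such a figure is typically computed, assuming the `evaluate` library's "wer" metric; the exact evaluation code used for this model is not published, and the transcripts shown are illustrative only.

```python
# Sketch only: how a WER value such as 6.6030 is typically derived for Whisper
# fine-tunes, assuming the `evaluate` library's "wer" metric (jiwer backend).
# The actual evaluation script for this model is not published.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["hola qué tal", "buenos días a todos"]    # model transcripts (illustrative)
references = ["hola, qué tal", "buenos días a todos"]    # reference transcripts (illustrative)

# compute() returns a fraction; multiply by 100 to match the percentage reported above
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```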

Model description

More information needed

Intended uses & limitations

More information needed
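
This checkpoint is a Whisper-style speech-recognition model fine-tuned from a Spanish base model, so transcription of Spanish audio is the expected use. The following is a minimal usage sketch with the transformers ASR pipeline; the audio path is a placeholder, and the device and dtype choices are assumptions.

```python
# Sketch only: transcribing Spanish audio with this checkpoint via the
# transformers ASR pipeline. "audio.wav" is a placeholder path; adjust
# device and dtype to your hardware.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="tangering-ai/whisper-large-turbo-finetune",
    torch_dtype=torch.float16,
    device="cuda:0",  # use "cpu" if no GPU is available
)

# chunk_length_s enables long-form transcription in 30-second windows
result = asr("audio.wav", chunk_length_s=30, return_timestamps=True)
print(result["text"])
```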

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a corresponding configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP
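
Below is a hedged reconstruction of how these values map onto transformers' Seq2SeqTrainingArguments. It is a sketch, not the authors' training script; output_dir, the 200-step eval cadence (inferred from the results table), fp16 vs. bf16, and predict_with_generate are assumptions.

```python
# Sketch only: the listed hyperparameters expressed as Seq2SeqTrainingArguments.
# Fields marked as assumptions are not stated in the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-turbo-finetune",  # assumption: placeholder name
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,      # effective train batch size of 16
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=6,
    fp16=True,                          # assumption: "Native AMP" taken as fp16
    eval_strategy="steps",
    eval_steps=200,                     # assumption: matches the eval cadence in the results table
    predict_with_generate=True,         # assumption: needed to compute WER during evaluation
)
```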

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER     |
|:--------------|:-------|:-----|:----------------|:--------|
| 0.5524        | 0.4324 | 200  | 0.4671          | 36.8885 |
| 0.3595        | 0.8649 | 400  | 0.3303          | 27.6716 |
| 0.2190        | 1.2962 | 600  | 0.2166          | 26.2860 |
| 0.1551        | 1.7286 | 800  | 0.1712          | 19.2001 |
| 0.0996        | 2.1600 | 1000 | 0.1329          | 18.1923 |
| 0.0717        | 2.5924 | 1200 | 0.1003          | 14.0248 |
| 0.0563        | 3.0238 | 1400 | 0.0790          | 12.8280 |
| 0.0338        | 3.4562 | 1600 | 0.0630          | 11.9253 |
| 0.0276        | 3.8886 | 1800 | 0.0480          | 6.6240  |
| 0.0161        | 4.3200 | 2000 | 0.0337          | 6.6030  |
| 0.0104        | 4.7524 | 2200 | 0.0267          | 4.8604  |
| 0.0058        | 5.1838 | 2400 | 0.0208          | 4.9339  |
| 0.0032        | 5.6162 | 2600 | 0.0177          | 3.1388  |

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1