DrishtiSharma/wav2vec2-large-xls-r-300m-bg-d2

Tags: automatic-speech-recognition · transformers · pytorch · tensorboard · wav2vec2 · bg · License: apache-2.0

wav2vec2-large-xls-r-300m-bg-d2

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - BG dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3421
  • Wer: 0.2860
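
WER (word error rate) is the word-level edit distance between the reference transcript and the model's hypothesis, divided by the number of reference words, so 0.2860 means roughly 29 errors per 100 reference words. A minimal sketch of the metric (for illustration only; the eval script uses the standard `datasets`/`jiwer` WER implementation):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution in a four-word reference -> WER 0.25
print(wer("кажи ми името си", "кажи ми име си"))
```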

Evaluation Commands

  1. To evaluate on mozilla-foundation/common_voice_8_0 with the test split:

python eval.py --model_id DrishtiSharma/wav2vec2-large-xls-r-300m-bg-d2 --dataset mozilla-foundation/common_voice_8_0 --config bg --split test --log_outputs

  2. To evaluate on speech-recognition-community-v2/dev_data:

python eval.py --model_id DrishtiSharma/wav2vec2-large-xls-r-300m-bg-d2 --dataset speech-recognition-community-v2/dev_data --config bg --split validation --chunk_length_s 10 --stride_length_s 1
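
The `--chunk_length_s 10 --stride_length_s 1` flags make the evaluation run long recordings through the model in overlapping 10-second windows, with 1 second of overlapping context on each side that is discarded when the per-window outputs are stitched back together. A rough sketch of how such window boundaries could be computed (an illustration of the idea, not the actual transformers internals):

```python
def chunk_bounds(n_samples, sr, chunk_length_s=10.0, stride_length_s=1.0):
    """Return (start, end) sample indices of overlapping inference windows.

    Each window is chunk_length_s long; consecutive windows overlap by
    2 * stride_length_s, so the stride regions serve as context and can be
    dropped when the per-window transcriptions are merged.
    """
    chunk = int(chunk_length_s * sr)
    stride = int(stride_length_s * sr)
    step = chunk - 2 * stride  # advance so adjacent windows share the strides
    bounds = []
    start = 0
    while start < n_samples:
        bounds.append((start, min(start + chunk, n_samples)))
        if start + chunk >= n_samples:
            break
        start += step
    return bounds

# 25 s of 16 kHz audio -> three overlapping 10 s windows
print(chunk_bounds(25 * 16000, 16000))
```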

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.00025
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 700
  • num_epochs: 35
  • mixed_precision_training: Native AMP
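
With gradient accumulation, the optimizer steps once every `gradient_accumulation_steps` forward/backward passes, so the effective batch size is the per-device batch size times the accumulation steps (16 × 2 = 32 above). Likewise, 700 warmup steps out of the ~4000 optimizer steps shown in the training results below means roughly the first 17.5% of training ramps the learning rate up before the linear decay. A quick check of that arithmetic:

```python
train_batch_size = 16            # per-device batch size
gradient_accumulation_steps = 2
# Effective batch size seen by each optimizer step:
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)    # 32, matching the value reported above

warmup_steps = 700
total_steps = 4000               # final step in the training-results table
warmup_fraction = warmup_steps / total_steps
print(warmup_fraction)           # 0.175
```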

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|--------------:|------:|-----:|----------------:|-------:|
| 6.8791        | 1.74  | 200  | 3.1902          | 1.0    |
| 3.0441        | 3.48  | 400  | 2.8098          | 0.9864 |
| 1.1499        | 5.22  | 600  | 0.4668          | 0.5014 |
| 0.4968        | 6.96  | 800  | 0.4162          | 0.4472 |
| 0.3553        | 8.7   | 1000 | 0.3580          | 0.3777 |
| 0.3027        | 10.43 | 1200 | 0.3422          | 0.3506 |
| 0.2562        | 12.17 | 1400 | 0.3556          | 0.3639 |
| 0.2272        | 13.91 | 1600 | 0.3621          | 0.3583 |
| 0.2125        | 15.65 | 1800 | 0.3436          | 0.3358 |
| 0.1904        | 17.39 | 2000 | 0.3650          | 0.3545 |
| 0.1695        | 19.13 | 2200 | 0.3366          | 0.3241 |
| 0.1532        | 20.87 | 2400 | 0.3550          | 0.3311 |
| 0.1453        | 22.61 | 2600 | 0.3582          | 0.3131 |
| 0.1359        | 24.35 | 2800 | 0.3524          | 0.3084 |
| 0.1233        | 26.09 | 3000 | 0.3503          | 0.2973 |
| 0.1114        | 27.83 | 3200 | 0.3434          | 0.2946 |
| 0.1051        | 29.57 | 3400 | 0.3474          | 0.2956 |
| 0.0965        | 31.3  | 3600 | 0.3426          | 0.2907 |
| 0.0923        | 33.04 | 3800 | 0.3478          | 0.2894 |
| 0.0894        | 34.78 | 4000 | 0.3421          | 0.2860 |

Framework versions

  • Transformers 4.16.2
  • Pytorch 1.10.0+cu111
  • Datasets 1.18.3
  • Tokenizers 0.11.0