# Harveenchadha/vakyansh-wav2vec2-tamil-tam-250


Preprocessing the datasets.

We need to read the audio files as arrays:

```python
import re
import torchaudio

def speech_file_to_array_fn(batch):
    # Strip punctuation from the reference transcription and lowercase it
    batch["sentence"] = re.sub(chars_to_ignore_regex, '', batch["sentence"]).lower()
    # Load the audio file and resample it to the model's 16 kHz sampling rate
    speech_array, sampling_rate = torchaudio.load(batch["path"])
    batch["speech"] = resampler(speech_array).squeeze().numpy()
    return batch

test_dataset = test_dataset.map(speech_file_to_array_fn)
```
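The snippet above assumes `chars_to_ignore_regex` and `resampler` are defined earlier. As a minimal, hypothetical sketch of the text-cleaning side (the character set shown here is an assumption, not the card's exact pattern):

```python
import re

# Hypothetical punctuation set commonly ignored in wav2vec2 evaluation scripts
chars_to_ignore_regex = '[,?.!;:"“%‘”-]'

def clean_sentence(sentence):
    # Mirrors the normalization inside speech_file_to_array_fn:
    # drop punctuation, then lowercase
    return re.sub(chars_to_ignore_regex, '', sentence).lower()

print(clean_sentence("Hello, World!"))  # -> hello world
```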

Now we can evaluate the model on the test set:

```python
import torch

def evaluate(batch):
    inputs = processor(batch["speech"], sampling_rate=16_000, return_tensors="pt", padding=True)

    # Run inference without tracking gradients
    with torch.no_grad():
        logits = model(inputs.input_values.to("cuda")).logits

    # Greedy decoding: take the most likely token at each time step
    pred_ids = torch.argmax(logits, dim=-1)
    batch["pred_strings"] = processor.batch_decode(pred_ids, skip_special_tokens=True)
    return batch

result = test_dataset.map(evaluate, batched=True, batch_size=8)

print("WER: {:.2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"])))
```
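Under the hood, `batch_decode` applies CTC-style decoding to the argmax ids: consecutive repeats are collapsed and blank (padding) tokens are dropped. A pure-Python sketch of that rule, using a hypothetical tiny vocabulary (not the model's actual tokenizer):

```python
def ctc_greedy_decode(pred_ids, id_to_char, blank_id=0):
    # CTC decoding rule: collapse consecutive repeats, then drop blanks
    chars = []
    prev = None
    for i in pred_ids:
        if i != prev and i != blank_id:
            chars.append(id_to_char[i])
        prev = i
    return "".join(chars)

# Hypothetical tiny vocab; "|" is the word delimiter in wav2vec2 vocabularies
vocab = {0: "<pad>", 1: "a", 2: "b", 3: "|"}
ids = [1, 1, 0, 2, 2, 3, 1]
print(ctc_greedy_decode(ids, vocab))  # -> ab|a
```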
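The `wer` metric computed above is the word-level edit distance between prediction and reference, divided by the reference length. A self-contained sketch of that computation (an illustration, not the metric library's implementation):

```python
def word_error_rate(prediction, reference):
    # Levenshtein distance over word sequences, normalized by reference length
    p, r = prediction.split(), reference.split()
    d = list(range(len(p) + 1))
    for i in range(1, len(r) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(p) + 1):
            cur = d[j]
            # deletion, insertion, or substitution/match
            d[j] = min(d[j] + 1, d[j - 1] + 1, prev + (r[i - 1] != p[j - 1]))
            prev = cur
    return d[len(p)] / len(r)

# One missing word out of a four-word reference
print("WER: {:.2f}".format(100 * word_error_rate("the cat sat", "the cat sat down")))  # -> WER: 25.00
```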


**Test Result**: 53.64 %

[**Colab Evaluation**](https://github.com/harveenchadha/bol/blob/main/demos/hf/tamil/hf_vakyansh_tamil_tnm_4200_evaluation_common_voice.ipynb) 

## Credits
Thanks to the Ekstep Foundation for making this possible. The Vakyansh team will be open-sourcing speech models in all Indic languages.