wer_list = []
cer_list = []
model_size_list = []
time_list = []

Evaluation Malayalam Speech Corpus (MSC) dataset
Loading dataset and evaluating model
load_malayalam_speech_corpus_dataset
load_malayalam_speech_corpus_dataset ()
Evaluating Whisper based model
evaluate_whisper_model_msc
evaluate_whisper_model_msc (model_name:str, werlist:List[float], cerlist:List[float], modelsizelist:List[str], timelist:List[float], bs:int=16)
| | Type | Default | Details |
|---|---|---|---|
| model_name | str | | The model name |
| werlist | typing.List[float] | | WER list |
| cerlist | typing.List[float] | | CER list |
| modelsizelist | typing.List[str] | | model size list |
| timelist | typing.List[float] | | time(s) list |
| bs | int | 16 | batch size |
| Returns | None | | |
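The function returns `None` and reports its results by appending to the lists passed in. As a rough illustration of what the WER and CER entries in those lists mean, here is a minimal pure-Python sketch of corpus-level error rates. The helper names `edit_distance` and `error_rate` are hypothetical and are not part of this library, and reporting the value as a percentage is an assumption; the actual implementation may rely on an external metrics package and may normalise text differently.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references: List[str], predictions: List[str], unit: str = "word") -> float:
    """Corpus-level error rate in percent: total edits / total reference units."""
    split = (lambda s: s.split()) if unit == "word" else list
    edits = sum(edit_distance(split(r), split(p)) for r, p in zip(references, predictions))
    total = sum(len(split(r)) for r in references)
    return 100.0 * edits / total


# Hypothetical reference and predicted transcripts, for illustration only.
references = ["the cat sat", "hello world"]
predictions = ["the cat sit", "hello world"]

# Results accumulate in caller-owned lists, mirroring the werlist/cerlist arguments.
wer_list: List[float] = []
cer_list: List[float] = []
wer_list.append(error_rate(references, predictions, unit="word"))
cer_list.append(error_rate(references, predictions, unit="char"))
```

Because the lists are owned by the caller, evaluating several models one after another leaves one entry per model in each list, which makes side-by-side comparison straightforward.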
Testing with a sample model
evaluate_whisper_model_msc("openai/whisper-tiny", wer_list, cer_list, model_size_list, time_list)
evaluate_whisper_model_msc("anuragshas/whisper-large-v2-ml", wer_list, cer_list, model_size_list, time_list, bs=4)

Evaluating faster-whisper based models
evaluate_faster_whisper_model_msc
evaluate_faster_whisper_model_msc (model_name:str, werlist:List[float], cerlist:List[float], modelsizelist:List[str], timelist:List[float], bs:int=16, compute_type:str='float16', beam_size=1)
A utility function for calculating WER on the Malayalam Speech Corpus (MSC) dataset, given the name of a model on Hugging Face. You can pass in WER, CER, model size, and time lists to accumulate results across successive evaluations.
| | Type | Default | Details |
|---|---|---|---|
| model_name | str | | The model name |
| werlist | typing.List[float] | | WER list |
| cerlist | typing.List[float] | | CER list |
| modelsizelist | typing.List[str] | | model size list |
| timelist | typing.List[float] | | time(s) list |
| bs | int | 16 | batch size. Default value is 16. |
| compute_type | str | float16 | The compute type supported by faster-whisper |
| beam_size | int | 1 | beam size |
| Returns | None | | |
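To make the `compute_type` and `beam_size` parameters more concrete, the sketch below shows one way they could be forwarded to faster-whisper and how the elapsed time might be measured. It is an illustration under stated assumptions, not the implementation behind `evaluate_faster_whisper_model_msc`: the helpers `make_faster_whisper_transcriber` and `timed_transcribe` are hypothetical names, and running on `"cuda"` with the language fixed to `"ml"` (Malayalam) are assumptions. The `faster_whisper` import is deferred so the timing helper can be used without the package installed.

```python
import time
from typing import Callable, Iterable, List, Tuple


def make_faster_whisper_transcriber(model_name: str,
                                    compute_type: str = "float16",
                                    beam_size: int = 1,
                                    device: str = "cuda") -> Callable[[str], str]:
    """Build a function that transcribes one audio file with faster-whisper."""
    # Imported lazily: requires the faster-whisper package to be installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name, device=device, compute_type=compute_type)

    def transcribe(audio_path: str) -> str:
        # transcribe() yields segments lazily; joining them forces decoding.
        segments, _info = model.transcribe(audio_path, beam_size=beam_size, language="ml")
        return " ".join(segment.text.strip() for segment in segments)

    return transcribe


def timed_transcribe(transcribe: Callable[[str], str],
                     audio_paths: Iterable[str]) -> Tuple[List[str], float]:
    """Run a transcriber over audio paths; return predictions and elapsed seconds."""
    start = time.perf_counter()
    predictions = [transcribe(path) for path in audio_paths]
    return predictions, time.perf_counter() - start
```

With a converted model, usage could look like `transcribe = make_faster_whisper_transcriber("kurianbenoy/vegam-whisper-medium-ml-fp16")` followed by `predictions, seconds = timed_transcribe(transcribe, audio_paths)`, where `audio_paths` is a caller-supplied list of audio files; `seconds` is the kind of value that would be appended to `timelist`.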
Evaluating faster-Whisper based model
wer_list = []
cer_list = []
model_size_list = []
time_list = []
evaluate_faster_whisper_model_msc("kurianbenoy/vegam-whisper-medium-ml-fp16", wer_list, cer_list, model_size_list, time_list)
wer_list, cer_list, model_size_list, time_list

Made by Kurian Benoy. See the code.