Evaluation on the Common Voice Malayalam subset

Load Common Voice 11.0 Malayalam Subset



load_common_voice_malayalam_dataset

 load_common_voice_malayalam_dataset ()

Transformer Whisper models



evaluate_whisper_model_common_voice

 evaluate_whisper_model_common_voice (model_name:str, werlist:List[float],
                                      cerlist:List[float],
                                      modelsizelist:List[str],
                                      timelist:List[float], bs:int=16)

A utility function for evaluating Whisper-based models on the Malayalam subset of the Common Voice dataset, given a model name on Hugging Face. You can pass in WER, CER, model size, and time lists to accumulate results across different epochs.

|  | Type | Default | Details |
|---|---|---|---|
| model_name | str |  | The model name |
| werlist | typing.List[float] |  | WER list |
| cerlist | typing.List[float] |  | CER list |
| modelsizelist | typing.List[str] |  | Model size list |
| timelist | typing.List[float] |  | Time (s) list |
| bs | int | 16 | Batch size. Default value is 16. |
| Returns | None |  |  |
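For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how a word error rate (WER) and character error rate (CER) can be computed from reference and predicted transcripts. The helper names `edit_distance` and `error_rate` are hypothetical, and the corpus-level aggregation and percent scaling are assumptions. This is not necessarily how `evaluate_whisper_model_common_voice` computes its numbers, and it applies no text normalisation.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references: List[str], predictions: List[str],
               unit: str = "word") -> float:
    """Corpus-level error rate in percent, rounded to two decimals.

    unit="word" gives a WER-style score, unit="char" a CER-style score.
    """
    total_errors = 0
    total_units = 0
    for ref, hyp in zip(references, predictions):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_errors += edit_distance(ref_units, hyp_units)
        total_units += len(ref_units)
    return round(100 * total_errors / total_units, 2)


refs = ["the cat sat"]
preds = ["the cat sit"]
print(error_rate(refs, preds))               # word-level score
print(error_rate(refs, preds, unit="char"))  # character-level score
```

In this sketch one wrong word out of three counts as a much larger error at the word level than at the character level, which is consistent with the CER being lower than the WER in the results on this page.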

Testing with a sample model

wer_list = []
cer_list = []
model_size_list = []
time_list = []
evaluate_whisper_model_common_voice("parambharat/whisper-tiny-ml", wer_list, cer_list, model_size_list, time_list)
Found cached dataset common_voice_11_0 (/home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0)
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-374585c2877047e3.arrow
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-22670505c562e0d4.arrow
/opt/conda/lib/python3.8/site-packages/transformers/generation_utils.py:1359: UserWarning: Neither `max_length` nor `max_new_tokens` has been set, `max_length` will default to 448 (`self.config.max_length`). Controlling `max_length` via the config is deprecated and `max_length` will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
Total time taken: 59.84694576263428
The WER of model: 38.76
The CER of model: 22.21
The model size is: 37.76M
['parambharat', 'whisper-tiny-ml']
wer_list
[38.76]
cer_list
[22.21]
model_size_list
['37.76M']
time_list
[59.84694576263428]
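Because the four lists grow by one entry per call, they can be lined up into a small comparison table afterwards. The sketch below is one possible way to do that. `format_results` and the separate `model_names` list are hypothetical and not part of this library; the caller is assumed to track the model names.

```python
from typing import List


def format_results(names: List[str], wer: List[float], cer: List[float],
                   sizes: List[str], times: List[float]) -> str:
    """Render the accumulated lists as a fixed-width text table."""
    header = f"{'model':<32}{'WER':>8}{'CER':>8}{'size':>10}{'time (s)':>10}"
    rows = [header]
    # zip stops at the shortest list, so every list needs one entry per model.
    for name, w, c, s, t in zip(names, wer, cer, sizes, times):
        rows.append(f"{name:<32}{w:>8.2f}{c:>8.2f}{s:>10}{t:>10.2f}")
    return "\n".join(rows)


# Values copied from the run above; model_names is maintained by the caller.
model_names = ["parambharat/whisper-tiny-ml"]
print(format_results(model_names, [38.76], [22.21], ["37.76M"],
                     [59.84694576263428]))
```

Note that `zip` silently drops rows when any list is shorter than the others, so it is worth checking that all lists have the same length before tabulating.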

Faster-Whisper models

from faster_whisper import WhisperModel

model = WhisperModel("kurianbenoy/vegam-whisper-medium-ml-fp16")

dataset = load_common_voice_malayalam_dataset()
t = dataset[0]

segments, info = model.transcribe(t["audio"]["array"], beam_size=5)
print("Detected language '%s' with probability %f" % (info.language, info.language_probability))
" ".join([segment.text for segment in segments])
'ഇന്ദിര വധത്തിനെ തുടർന്നുണ്ടായ സിഖുവിരുദ്ധ കലാപമാണ് വിഭജനത്തിനു ശേഷം സ്വതന്ത്ര്യ ഇന്ത്യ കണ്ടെത്തിൽ വെച്ച'


evaluate_faster_whisper_model_common_voice

 evaluate_faster_whisper_model_common_voice (model_name:str,
                                             werlist:List[float],
                                             cerlist:List[float],
                                             modelsizelist:List[str],
                                             timelist:List[float],
                                             bs:int=16,
                                             compute_type:str='float16',
                                             beam_size=1)

A utility function for calculating WER on the Common Voice dataset, given a model name on Hugging Face. You can pass in WER, CER, model size, and time lists to accumulate results across different epochs.

|  | Type | Default | Details |
|---|---|---|---|
| model_name | str |  | The model name |
| werlist | typing.List[float] |  | WER list |
| cerlist | typing.List[float] |  | CER list |
| modelsizelist | typing.List[str] |  | Model size list |
| timelist | typing.List[float] |  | Time (s) list |
| bs | int | 16 | Batch size. Default value is 16. |
| compute_type | str | float16 | The compute type supported by faster-Whisper |
| beam_size | int | 1 | Beam size |
| Returns | None |  |  |
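The function returns `None` and reports its results by appending to the lists passed in. A minimal sketch of that pattern, including the wall-clock timing, might look like the following. `timed_evaluation` and its `run` callback are hypothetical stand-ins for illustration, not the library's implementation, and the callback here returns fixed numbers instead of running a model.

```python
import time
from typing import Callable, List, Tuple


def timed_evaluation(run: Callable[[], Tuple[float, float]],
                     werlist: List[float], cerlist: List[float],
                     timelist: List[float]) -> None:
    """Time one evaluation run and append its results to the caller's lists."""
    start = time.perf_counter()
    wer, cer = run()  # stand-in for transcription plus scoring
    elapsed = time.perf_counter() - start
    # Mutating the lists in place is what lets results accumulate across calls.
    werlist.append(wer)
    cerlist.append(cer)
    timelist.append(elapsed)
    print(f"Total time taken: {elapsed}")


wers, cers, times = [], [], []
# Placeholder callback: returns the scores reported on this page, runs no model.
timed_evaluation(lambda: (24.71, 18.57), wers, cers, times)
print(wers, cers)
```

Calling the helper again with the same lists adds a second entry to each, which is how several models or settings such as `compute_type` and `beam_size` could be compared side by side.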

Evaluating faster-Whisper based model

wer_list = []
cer_list = []
model_size_list = []
time_list = []
evaluate_faster_whisper_model_common_voice("kurianbenoy/vegam-whisper-medium-ml-fp16", wer_list, cer_list, model_size_list, time_list)
wer_list, cer_list, model_size_list, time_list
Total time taken: 91.5117712020874
The WER of model: 24.71
The CER of model: 18.57
['kurianbenoy', 'vegam-whisper-medium-ml-fp16']
([24.71], [18.57], [], [91.5117712020874])

Made by Kurian Benoy. See the code.