Submitted by ruizard t3_10px4n6 in MachineLearning
Hi all, I'm trying to fine-tune Whisper to transcribe Albanian speech to text, but I don't know what the training dataset for a Whisper model should look like.
I already have audio recordings and a transcript for each audio file, but I need to know how to reformat them into a valid dataset for training Whisper.
Thanks in advance!
pronunciaai t1_j6mqq5n wrote
Hugging Face just finished a sprint where they fine-tuned Whisper on 100s of languages. Going to their Discord and following the guides is going to be by far the easiest way.
Check the ML-4-AUDIO channels "sprint-announcement", "discussions", and "whisper-model-playground".
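For the "what should the dataset look like" part, here is a rough sketch of one common layout, not necessarily what the sprint guides use. It assumes each recording sits next to a same-named .txt transcript (e.g. clip1.wav and clip1.txt) and writes a metadata.csv with a file_name column, which is the layout the Hugging Face datasets "audiofolder" loader reads. The function name build_metadata and the column name "transcription" are made up for illustration.

```python
import csv
from pathlib import Path


def build_metadata(data_dir, audio_ext=".wav"):
    """Pair each audio file with its same-named .txt transcript.

    Writes data_dir/metadata.csv with the columns file_name and
    transcription, and returns the rows that were written.
    Assumption: transcripts are UTF-8 text files next to the audio.
    """
    data_dir = Path(data_dir)
    rows = []
    for audio_path in sorted(data_dir.glob(f"*{audio_ext}")):
        transcript_path = audio_path.with_suffix(".txt")
        if not transcript_path.exists():
            continue  # skip recordings that have no transcript file
        text = transcript_path.read_text(encoding="utf-8").strip()
        if text:  # skip empty transcripts
            rows.append({"file_name": audio_path.name, "transcription": text})

    # newline="" lets the csv module control line endings itself
    with (data_dir / "metadata.csv").open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["file_name", "transcription"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

If that layout fits your files, the folder should then load with load_dataset("audiofolder", data_dir=...) from the datasets library. Whisper's feature extractor expects 16 kHz audio, so you would probably also cast the audio column with Audio(sampling_rate=16000) before training.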