dfcHeadChair t1_jbaljm8 wrote on March 7, 2023 at 5:47 PM

An MLP probably isn't a great solution for speech recognition, but if it's for a class and you can only use NumPy, start here: https://towardsdatascience.com/coding-a-neural-network-from-scratch-in-numpy-31f04e4d605

alexilas OP t1_jban3hs wrote on March 7, 2023 at 5:57 PM

Thanks for that link!! But out of curiosity, what would you use instead of an MLP?

dfcHeadChair t1_jbau8dy wrote on March 7, 2023 at 6:42 PM

If you’re only detecting speech, that is doable in simple cases with heuristics and some napkin math, or with an MLP. However, “detect speech in this audio” is rarely the end of the story in the real world. Next up come transcription, sentiment analysis, tonal feature flagging, etc., all of which are currently dominated by Transformers. You’ll also see some great work in the RNN space, but Transformer-based architectures are king right now.

Some models for inspiration: https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&sort=downloads
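
To make the "heuristics and napkin math" idea concrete, here is a rough sketch of one common heuristic: split the audio into short frames and flag any frame whose average energy is above a threshold. The function names, the 400-sample frame length, and the 0.01 threshold are all made up for illustration, and the input is a synthetic tone rather than real speech. Keep in mind this only detects "something loud", so treat it as a starting point, not a real speech detector.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude for each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def detect_speech(samples, frame_len=400, threshold=0.01):
    """Return one boolean per frame: True where energy exceeds the threshold.

    frame_len and threshold are illustrative guesses, not tuned values.
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]

# Synthetic input: 0.1 s of silence, then 0.1 s of a 220 Hz tone at 16 kHz.
sample_rate = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(1600)]

flags = detect_speech(silence + tone)
print(flags)
```

With this input there are eight frames: the four silent ones come back False and the four tone frames come back True. On real recordings you would probably need to set the threshold relative to the background noise level, and add something like zero-crossing rate to tell speech apart from other loud sounds.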

alexilas OP t1_jbazarg wrote on March 7, 2023 at 7:14 PM

Thanks!! I really appreciate it. I really like the AI world, and if it's not too much to ask, I'd appreciate anything else you would recommend for going further. Again, thanks!!

dfcHeadChair t1_jbb2dyi wrote on March 7, 2023 at 7:34 PM

Yep, feel free.