Submitted by Illustrious-Force-74 t3_112k80y in deeplearning
Comments
Oceanboi t1_j8l880s wrote
Oooooo buddy. You’re in for a ride. Check out some PyTorch documentation. There’s plenty related to audio classification.
Oceanboi t1_j8l8tst wrote
I’m guessing your company won’t have the resources or data to train a CNN to convergence from scratch, so read up on some common CNNs that people use for audio transfer learning (EfficientNet has worked well for me, as did ResNet50, albeit less so). Once you can implement one pretrained model, you can implement most of them fairly easily to see which one suits your task best. Also read up on Sharan et al. 2019 and 2021, as he benchmarks numerous image representations, model architectures, and network fusion techniques. While results may vary, it is a great empirical starting point, although I was not able to reproduce his results with his model architecture. Pay less attention to the actual architecture he talks about, because you’ll most likely be doing transfer learning, where you import a model and its weights. For preprocessing, look into MATLAB and the Auditory Modeling Toolbox, or if you’re using Python, look into librosa, torchaudio, and brian2hears for more complex filterbank models.
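For anyone wondering what those preprocessing libraries roughly compute, here is a minimal sketch of a triangular mel filterbank in plain Python, with no dependencies. The function names, the HTK-style mel formula, and the parameter values are my own assumptions for illustration; they are not taken from Sharan et al. or from the internals of librosa or torchaudio, and in practice you would call one of those libraries instead.

```python
import math

def hz_to_mel(hz):
    # HTK-style mel formula (one common variant; libraries may default to another).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters, each a list of n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 points equally spaced on the mel scale: filter edges and centers.
    mel_points = [max_mel * i / (n_mels + 1) for i in range(n_mels + 2)]
    hz_points = [mel_to_hz(m) for m in mel_points]
    # Center frequency (Hz) of each FFT bin.
    bin_freqs = [k * sample_rate / n_fft for k in range(n_bins)]

    filters = []
    for m in range(1, n_mels + 1):
        left, center, right = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        row = []
        for f in bin_freqs:
            if left < f <= center:
                row.append((f - left) / (center - left))    # rising slope
            elif center < f < right:
                row.append((right - f) / (right - center))  # falling slope
            else:
                row.append(0.0)
        filters.append(row)
    return filters

def apply_filterbank(power_spectrum, filters, eps=1e-10):
    """Map one power-spectrum frame to log-mel energies."""
    return [
        math.log(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
        for row in filters
    ]
```

Applying `apply_filterbank` to every frame of a short-time power spectrum and stacking the results gives a 2D log-mel array, which is the kind of image-like input a pretrained CNN can be fine-tuned on.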
Illustrious-Force-74 OP t1_j8mb2il wrote
Awesome...Thanks!
exclaim_bot t1_j8mb33e wrote
>Awesome...Thanks!
You're welcome!
Nerveregenerator t1_j99gc4f wrote
You just use MFCCs, and then it’s just like image classification.
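To make the "MFCCs, then treat it like an image" idea a bit more concrete: MFCCs are typically computed as the discrete cosine transform of the log-mel energies of each frame, and stacking the frames gives a 2D array. Below is a rough sketch using only the standard library. The log-mel values are made-up numbers, the helper names are hypothetical, and the DCT-II is unnormalized (libraries often apply an orthonormal scaling), so treat it as an illustration rather than a drop-in replacement for librosa or torchaudio.

```python
import math

def dct2(values, n_coeffs):
    """Unnormalized DCT-II of one frame, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]

def mfcc_matrix(log_mel_frames, n_coeffs=13):
    """Turn a list of log-mel frames into a frames x n_coeffs MFCC matrix."""
    return [dct2(frame, n_coeffs) for frame in log_mel_frames]

# Hypothetical log-mel energies: 3 frames x 8 mel bands (made-up values).
log_mel_frames = [
    [1.0] * 8,
    [0.5, 0.7, 1.2, 1.9, 1.4, 0.9, 0.6, 0.4],
    [0.2, 0.3, 0.5, 0.8, 1.3, 2.1, 1.7, 1.0],
]

# Each row is one time frame; the resulting 2D array can be fed to a CNN
# much like a single-channel image.
mfccs = mfcc_matrix(log_mel_frames, n_coeffs=4)
```

A flat frame (the first one here) puts all its energy in coefficient 0, which is why the higher coefficients are often described as capturing the shape of the spectrum rather than its overall level.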
ZaZaMood t1_j8kurf5 wrote
Brooo there are sooo many. No need to pay. Just Google:
CNN audio classification site:GitHub.com
Example:
https://github.com/jeffprosise/Deep-Learning/blob/master/Audio%20Classification%20(CNN).ipynb