Oceanboi

Oceanboi t1_jd6g49h wrote

How do these handmade features compare to features learned by CNNs? The only reason I ask is that I'm finishing up some thesis work on sound event detection using different spectral representations as inputs to CNNs (cochleagram, linear gammachirp, logarithmic gammachirp, approximate gammatone filters, etc.). I'm wondering how these features perform in comparison on similar tasks (UrbanSound8K) and where this fits in the larger scheme of things.
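For context, here's a minimal sketch of the two kinds of inputs I mean, assuming librosa and a placeholder wav path (my actual front ends are gammachirp/gammatone filterbanks, but a log-mel spectrogram stands in here):

```python
# Rough sketch (not my actual pipeline): handcrafted summary features vs. a
# time-frequency representation that a CNN learns its own filters over.
import librosa
import numpy as np

path = "example.wav"  # placeholder file path
y, sr = librosa.load(path, sr=22050)

# "Handmade" features: e.g. mean/std of MFCCs, a fixed-length vector you would
# feed to a classical classifier (SVM, random forest, ...).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
handmade = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])  # shape (40,)

# CNN input: a 2-D spectral representation (here a log-mel spectrogram as a
# stand-in for a cochleagram/gammatone front end); the CNN learns features
# from this "image" rather than from hand-picked statistics.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
log_mel = librosa.power_to_db(mel)  # shape (64, time_frames)
```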

2

Oceanboi t1_j8zdely wrote

Oh I see, I missed the major point that the training data is basically too incomplete to model the entire relationship on its own.

Why embed priors into neural networks? Doesn't Bayesian modeling with MCMC do pretty much what this is attempting to do? We did something similar in one of my courses, although we didn't get to spend enough time on it, so forgive me if my questions are stupid. I'd also need someone to walk me through a motivating example for a PINN, because I'd just get lost in generalities otherwise. I get the example, but I'm failing to see the larger use case.
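To check my own understanding, the toy picture I have in my head is something like the sketch below (almost certainly oversimplified; the decay constant, network size, and equal loss weighting are arbitrary choices on my part): fit an ODE like du/dt = -k*u from a few data points plus a physics residual penalty.

```python
# Toy sketch of what I *think* a PINN is doing: a few "measured" points plus
# a physics residual (du/dt + k*u = 0) enforced at unlabeled collocation points.
import torch
import torch.nn as nn

torch.manual_seed(0)
k = 1.5  # arbitrary decay constant
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))

# Incomplete data: observations only early in time.
t_data = torch.tensor([[0.0], [0.1], [0.2]])
u_data = torch.exp(-k * t_data)

# Collocation points where only the physics is enforced (no labels needed).
t_phys = torch.linspace(0, 2, 50).reshape(-1, 1).requires_grad_(True)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    opt.zero_grad()
    # Data loss: match the few observations we have.
    loss_data = ((net(t_data) - u_data) ** 2).mean()
    # Physics loss: ODE residual computed via autograd.
    u = net(t_phys)
    du_dt = torch.autograd.grad(u, t_phys, grad_outputs=torch.ones_like(u),
                                create_graph=True)[0]
    loss_phys = ((du_dt + k * u) ** 2).mean()
    (loss_data + loss_phys).backward()
    opt.step()
```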

1

Oceanboi t1_j8uygkt wrote

Why was the neural network stopped at ~1000 steps? Why are we comparing a physics-informed neural network to a plain neural network at a different number of training steps lol

Also, correct me if I'm wrong, but don't we care about how the model generalizes? I think we can show that some NN will fit any training set perfectly given enough steps, but that's already common knowledge, no?

1

Oceanboi t1_j8l8tst wrote

I'm guessing your company won't have the resources or data to train a CNN to convergence from scratch, so read up on some common CNNs that people use for audio transfer learning (EfficientNet has worked well for me, as did ResNet50, albeit less so). Once you can implement one pretrained model, you can implement most of them fairly easily and see which one suits your task best. Also read up on Sharan et al. (2019, 2021), who benchmark numerous image representations, model architectures, and network fusion techniques. While results may vary, it is empirically a great starting point, although I was not able to reproduce his results with his model architecture. Pay less attention to the actual architecture he describes, because you'll most likely be doing transfer learning, where you import a model and its weights. For preprocessing, look into the Auditory Modeling Toolbox for MATLAB, or if you're using Python, look into librosa, torchaudio, and brian2hears for more complex filterbank models.
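If it helps, a stripped-down sketch of the transfer-learning skeleton I mean, assuming PyTorch/torchvision and spectrograms reshaped to 3-channel 224x224 "images" (the backbone, head, and sizes here are just illustrative):

```python
# Rough transfer-learning skeleton: take a pretrained image backbone, freeze
# it, and retrain only a new classifier head on spectrogram "images".
import torch
import torch.nn as nn
from torchvision import models

num_classes = 10  # e.g. UrbanSound8K
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze the pretrained convolutional base.
for p in model.parameters():
    p.requires_grad = False

# Replace the final fully connected layer with a head for our classes.
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Spectrograms need to look like images: (batch, 3, H, W), e.g. by repeating
# the log-mel spectrogram across 3 channels and resizing. Random tensors here
# stand in for a real batch.
dummy_batch = torch.randn(8, 3, 224, 224)
dummy_labels = torch.randint(0, num_classes, (8,))
loss = criterion(model(dummy_batch), dummy_labels)
loss.backward()
optimizer.step()
```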

1

Oceanboi t1_j5n6p7b wrote

My advice is to proceed. It's cool to know the math underneath, but just go implement stuff, dude; if it doesn't work, you can always remote into or rent a GPU. What I did for my thesis was google tutorials and re-implement them using my dataset. Through all the bugs and the elbow grease, you will learn enough to at least speak the language. Just do it and don't procrastinate with these types of posts (I do this too sometimes).

EDIT: a lot can be done on Colab these days regarding neural networks and Hugging Face. Google the Hugging Face documentation! I implemented a Hugging Face transformer model to do audio classification (and I'm a total noob, I just copied a tutorial). It was a total misuse of the model and accuracy was bad, but at least I learned, and given a real problem I could at least find my way forward.
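For reference, the tutorial I copied boiled down to something like this sketch (from memory; the model name is just one AudioSet-style example and the wav path is a placeholder, so check the current Hugging Face docs):

```python
# Minimal audio classification with a pretrained Hugging Face model via the
# pipeline API.
from transformers import pipeline

# Example model (an Audio Spectrogram Transformer fine-tuned on AudioSet);
# swap in whatever the docs currently recommend for your task.
clf = pipeline("audio-classification",
               model="MIT/ast-finetuned-audioset-10-10-0.4593")

preds = clf("example.wav")  # path to a wav file
for p in preds:
    print(p["label"], round(p["score"], 3))
```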

1

Oceanboi t1_j3ef4wv wrote

A bit surprised to see the cavalier sentiments on here. I often wonder if commercials will eventually be required to disclose when content is computer-generated (Unreal Engine 5 demos have fooled me a few times), and deepfakes come to mind as a major problem (I just saw an Elon one that took me embarrassingly long to identify as fake). I don't think tons of regulation should occur per se, other than certain legal disclosure requirements for certain forms of media to prevent misinformation.

2

Oceanboi OP t1_j0cpolp wrote

Data is handled the same way in both. I think it has to do with what u/MrFlufypants said, because when I restart my kernel and run it after freeing up some resources, it runs. I think the number of filters I was setting is right at the threshold at which my GPU runs out of VRAM, so small memory-management differences between TF and PyTorch cause TF to hit the limit sooner than PyTorch.
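For anyone who hits the same wall, the TF-side tweak I'd try first is enabling memory growth, so TF allocates VRAM as needed instead of grabbing a large block up front (a sketch using the TF 2.x config API; it has to run before any model touches the GPU):

```python
# Ask TensorFlow to grow GPU memory on demand rather than pre-allocating,
# which can shift where the out-of-memory threshold falls.
import tensorflow as tf

for gpu in tf.config.list_physical_devices("GPU"):
    # Must be set before the GPU is initialized (i.e., before building models).
    tf.config.experimental.set_memory_growth(gpu, True)
```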

1

Oceanboi t1_iz2tn86 wrote

Is this natural error rate purely theoretical, or is there some effort to quantify the ceiling?

If I'm understanding correctly, you're saying there will always be some natural ceiling on accuracy for problems in which the X data doesn't hold enough information to perfectly predict Y, or by its nature just doesn't help us predict Y?
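Something like this toy simulation is how I'm picturing it: even the best possible decision rule tops out well below 100% because the class-conditional distributions overlap (the numbers are made up purely to illustrate):

```python
# Toy illustration of an irreducible error ceiling: two classes whose X
# distributions overlap, so even the Bayes-optimal rule makes mistakes.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
y = rng.integers(0, 2, size=n)
# Class 0 ~ N(0, 1), class 1 ~ N(1, 1): heavy overlap in X.
x = rng.normal(loc=y.astype(float), scale=1.0)

# With equal priors and variances, the Bayes-optimal boundary is x > 0.5.
y_hat = (x > 0.5).astype(int)
print("best-possible accuracy ~", (y_hat == y).mean())  # ~0.69, never 1.0
```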

1

Oceanboi t1_iz2bc3h wrote

Do you know if this is done by simply training for a massive number of epochs and adding layers until you hit 100%?

I may still just be new, but I've never been able to achieve this in practice. I'd be really interested in practical advice on how to overfit a dataset. I'm still unsure about the results of that paper you linked; I feel like I'm misinterpreting it in some way.

Is it really suggesting that overparametrizing a model past the overfitting point and continuing to train will ultimately yield a model that generalizes well?

I am using a dataset of 8500 sounds across 10 classes. I cannot push past 70-75% accuracy, and the more layers I add to the convolutional base, the lower my accuracy becomes. Are they suggesting the layers be added to the classifier head only? I'm all for overparametrizing a model and leaving it on for days; I just don't know how to be deliberate in this effort.
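For concreteness, the only deliberate-overfitting recipe I know is roughly the sketch below: strip out dropout/weight decay/augmentation, start with a small subset, and watch training accuracy only (the random tensors are stand-ins for my actual spectrogram dataset, and the little conv net is just illustrative):

```python
# Deliberate overfit sanity check: no regularization, small fixed subset,
# train until *training* accuracy saturates.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

x = torch.randn(256, 1, 64, 64)          # fake "spectrograms"
y = torch.randint(0, 10, (256,))         # 10 classes
loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(64 * 16 * 16, 10),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)  # no regularization
loss_fn = nn.CrossEntropyLoss()

for epoch in range(200):  # keep going until memorization
    correct = 0
    for xb, yb in loader:
        opt.zero_grad()
        logits = model(xb)
        loss_fn(logits, yb).backward()
        opt.step()
        correct += (logits.argmax(1) == yb).sum().item()
    # If this never approaches 1.0, capacity or the input representation is
    # the bottleneck, not regularization.
    print(epoch, correct / len(x))
```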

1

Oceanboi t1_iyvf94s wrote

This confuses me. I have almost never gotten an audio or image model over 90% accuracy, and it seems to be largely problem- and data-dependent. I've never been able to reach very low training loss. Does this only apply to extremely large datasets?

If I had a model with 90% train and test accuracy, is that not a good, adequately fit model for most real business solutions? Obviously, if the model predicted something with severe consequences we'd want closer to 100%, but that's the IDEAL, right?

3

Oceanboi t1_iyq03g2 wrote

Why do you say an optimal learning algorithm should have zero hyperparameters? Are you saying an optimal neural network would learn things like batch size, learning rate, the optimal optimizer (lol), input size, etc., on its own? In that case, wouldn't a model with zero hyperparameters be conceptually the same as a model that has been tuned to the optimal hyperparameter combination?

Theoretically you could make these hyperparameters trainable if you had the coding chops, so why, as a community, are we still tweaking hyperparameters iteratively?
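To be concrete, "tweaking iteratively" on my end is basically an outer search loop like this sketch, which is exactly the part I'm asking why the learner can't absorb (the train_and_evaluate stub is a placeholder for a real training/validation run):

```python
# Plain random search over a couple of hyperparameters around a training run.
import random

random.seed(0)

def train_and_evaluate(config):
    # Stub standing in for "train a model with these settings and return
    # validation accuracy".
    return random.random()

best_config, best_acc = None, -1.0
for trial in range(20):
    config = {
        "lr": 10 ** random.uniform(-5, -2),           # log-uniform learning rate
        "batch_size": random.choice([16, 32, 64, 128]),
    }
    acc = train_and_evaluate(config)
    if acc > best_acc:
        best_config, best_acc = config, acc
print(best_config, best_acc)
```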

1

Oceanboi t1_iypx661 wrote

Half-spaces, hyperplanes, hmm. It seems my current understanding of entropy is very limited. Could you link me some relevant material so I can understand what a "zero level hypersurface" is? I've only ever seen simple examples of entropy / Gini impurity for splits in random forests, so I'm interested in learning more.
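For reference, this is the extent of what I currently know about entropy/Gini, i.e. the impurity of a candidate split computed from class proportions (toy numbers):

```python
# Entropy and Gini impurity of the labels on each side of a decision-tree split.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

left = np.array([0, 0, 0, 1])        # labels falling on one side of a split
right = np.array([1, 1, 1, 0, 1])    # and on the other
print(entropy(left), gini(left))     # ~0.811, 0.375
print(entropy(right), gini(right))   # ~0.722, 0.32
```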

1

Oceanboi OP t1_ixtiolt wrote

Not from scratch for all of them; I simply want to take a base model, or a set of base model architectures, and compare how different audio representations (cochleagram and other cochlear models) perform in terms of accuracy/model performance. That's what got me looking into transfer learning, hence the question! I need some constant set of models to use for my comparisons.

1

Oceanboi OP t1_ixkc2dn wrote

It is trained on AudioSet. I listed YAMNet to highlight the lack of large audio models compared to image models, and to highlight the problem that its architecture limits what input data you can feed it.

Also, I mainly see transfer learning for CNNs in Kaggle notebooks, and I could only find a few papers where an ImageNet-trained network is used as one of the models.

https://arxiv.org/pdf/2007.07966

https://research.google/pubs/pub45611/

These are just a few examples, but it seems decently common.

1