AKavun
AKavun OP t1_j3l4gb1 wrote
u/trajo123 u/FastestLearner
I am giving this as a general update. In my original post, I said "I am doing something very obvious wrong" and indeed I was. The reason my model did not learn at all was that the whole Python script, with the exception of my main method, was being re-executed every few seconds, which caused my model to reinitialize and reset. I believe this was caused by how PyTorch handles the "num_workers" parameter of the DataLoader: it starts worker processes (multiprocessing rather than multithreading), and depending on the platform's process start method, each worker can re-import the script and run its top-level code again.
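For anyone hitting the same problem, here is a minimal sketch of the usual fix, not my actual script. It assumes the cause was worker processes re-importing the file, which is what happens with the "spawn" start method (the default on Windows and macOS). The toy data, the tiny linear model, and the function name `train` are made up for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(num_workers=2, epochs=2):
    # Hypothetical toy data standing in for the real image dataset.
    features = torch.randn(64, 8)
    labels = torch.randint(0, 2, (64,))
    loader = DataLoader(
        TensorDataset(features, labels),
        batch_size=16,
        shuffle=True,
        num_workers=num_workers,
    )

    # The model is created inside the function, so a worker process that
    # re-imports this file does not build a fresh model at import time.
    model = nn.Linear(8, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model


# Only the process that was launched directly runs the training.
# Worker processes that import this module skip this block.
if __name__ == "__main__":
    train()
```

Setting `num_workers=0` loads the data in the main process and is a quick way to check whether the workers are the cause.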
So fixing that allowed my model to learn, but it still performed poorly for the reasons all of you so generously explained in great detail. My first instinctive reaction was to switch to resnet18 and change the output layer. I also switched to cross-entropy loss, as I learned I can still apply softmax in postprocessing to obtain the prediction confidence, which I previously did not think was possible. Now my model reaches 90% accuracy on my test set, and the rest, I think, is just tweaking the hyperparameters, enlarging and augmenting the data, and maybe doing some partial training with different learning rates, etc.
However, I still want to learn how to design an architecture from scratch, so I am experimenting with that after carefully reading the answers you provided. I thank each of you so much and wish you all success in your careers. You are great people and we are a great community.
AKavun OP t1_j3btlem wrote
Reply to comment by suflaj in Why didn't my convolutional image classifier network learn anything! by AKavun
I also have a validation accuracy of around 50%, which is basically what random guessing would give.
I removed the weight decay to keep things simpler and adjusted the learning rate to 0.0003. I will update this thread on the results.
Thank you for taking the time to help
Submitted by AKavun t3_105na47 in deeplearning
AKavun OP t1_irx83tz wrote
Reply to comment by Electronic-Art-2105 in [P] Making attribute classification on an image of a clothing by AKavun
>I see. In the tutorial, for each output, a 1-dimensional Dense layer with a sigmoid activation function is used, along with binary crossentropy as the loss function. You could exchange that by an n-dimensional Dense layer with softmax activation, along with categorical crossentropy. So the basic architecture can remain similar, you just have to adapt the outputs.
I will first learn what these things mean, then I will get back to you. Thank you for your guidance.
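As a rough picture of what that suggestion could look like, here is a sketch written in PyTorch rather than the Keras-style layers the tutorial uses. Everything specific in it is an assumption for illustration: the attribute names and sizes, the 16-dimensional feature vector standing in for a CNN backbone's output, and the names `MultiAttributeClassifier` and `total_loss`.

```python
import torch
from torch import nn


class MultiAttributeClassifier(nn.Module):
    """One linear head per attribute, each with one output per possible value."""

    def __init__(self, in_features, attribute_sizes):
        super().__init__()
        self.heads = nn.ModuleDict(
            {name: nn.Linear(in_features, size) for name, size in attribute_sizes.items()}
        )

    def forward(self, features):
        # Returns raw logits for every attribute, keyed by attribute name.
        return {name: head(features) for name, head in self.heads.items()}


def total_loss(logits, targets):
    # Sum of the categorical cross-entropy losses over all attributes.
    criterion = nn.CrossEntropyLoss()
    return sum(criterion(logits[name], targets[name]) for name in logits)


# Hypothetical attributes: 5 possible colors, 3 possible sleeve lengths.
attribute_sizes = {"color": 5, "sleeve_length": 3}
model = MultiAttributeClassifier(in_features=16, attribute_sizes=attribute_sizes)

# Random features standing in for the output of an image backbone.
features = torch.randn(4, 16)
targets = {name: torch.randint(0, size, (4,)) for name, size in attribute_sizes.items()}

logits = model(features)
loss = total_loss(logits, targets)

# Softmax turns one head's logits into probabilities over that attribute's values.
color_probs = torch.softmax(logits["color"], dim=1)
```

In PyTorch the softmax is folded into `nn.CrossEntropyLoss`, so the heads output logits, whereas the Keras version described above puts the softmax activation in the Dense layer itself.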
AKavun OP t1_irwurmu wrote
Reply to comment by Electronic-Art-2105 in [P] Making attribute classification on an image of a clothing by AKavun
Yeah, this is mostly similar to what I want to do but there is a difference.
In the tutorial, there are only binary attributes, such as whether a celebrity is bald or not. But I want to handle multi-valued attributes, like the color of the clothing, which can take many values, not just 1 or 0.
With this in mind, is it still multilabel classification?
AKavun OP t1_irvc02x wrote
Reply to comment by Seankala in [P] Making attribute classification on an image of a clothing by AKavun
Thank you, I am reading about it right now!
AKavun OP t1_irvbvrj wrote
Reply to comment by PassionatePossum in [P] Making attribute classification on an image of a clothing by AKavun
As I said, I am a beginner to this stuff. Even though I am familiar with every term in that sentence, can you maybe share some articles or videos that are doing or explaining something similar to what you have in mind so that I can understand you better?
Submitted by AKavun t3_y0zbfj in MachineLearning
AKavun OP t1_j3l51kx wrote
Reply to comment by trajo123 in Why didn't my convolutional image classifier network learn anything! by AKavun
Thank you sir, I posted a general update to this thread and I will be further updating you about everything.