SufficientStautistic t1_iz1c1yp wrote

+1 for using a model template rather than experimenting and hand-crafting things for your problem. Many good general-purpose architectures for classification exist, and in my experience they work very well. For the classification problem you describe you will probably be fine using one of the architectures listed on the KerasCV page (or the equivalent place in the timm/PyTorch docs). I'd recommend starting from a pretrained model.

The approach I usually take to a CV problem is to survey which architectures are recommended for the problem in the abstract (e.g. classification, segmentation, pose estimation, etc.), try those, and then make modifications based on the specifics of the problem if necessary.

Tbh you might not even need a deep vision model for your problem.
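A non-deep baseline is cheap enough to try first anyway. A minimal sketch with NumPy (nearest-centroid on raw pixels; the toy data below is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for flattened grayscale images: two classes
# that differ in mean intensity.
X_train = np.vstack([rng.normal(0.2, 0.1, (50, 64)),   # class 0
                     rng.normal(0.8, 0.1, (50, 64))])  # class 1
y_train = np.array([0] * 50 + [1] * 50)

# Nearest-centroid classifier: one mean vector per class.
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(X):
    # Distance from each sample to each class centroid; pick the closest.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    return dists.argmin(axis=1)

X_test = np.vstack([rng.normal(0.2, 0.1, (10, 64)),
                    rng.normal(0.8, 0.1, (10, 64))])
y_test = np.array([0] * 10 + [1] * 10)
acc = (predict(X_test) == y_test).mean()
```

If something this simple already does well on your data, a deep model may be overkill.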

2

SufficientStautistic t1_iyqag72 wrote

I am always delighted to see a median with accompanying 5% and 95% quantiles at each validation step/end of each epoch. That is more helpful to me than some multiple of the s.d. Even a mean with a standard error goes further than many papers do, so I will take that too; just give us some measure of variance, for the love of god haha.
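Computing that summary is trivial once you've logged one score per seed/run (the scores array below is made up for illustration):

```python
import numpy as np

# Hypothetical validation accuracies from 20 runs with different seeds.
scores = np.array([0.71, 0.74, 0.73, 0.78, 0.72, 0.76, 0.75, 0.77,
                   0.74, 0.73, 0.79, 0.72, 0.75, 0.76, 0.74, 0.73,
                   0.77, 0.75, 0.74, 0.76])

median = np.median(scores)
q05, q95 = np.percentile(scores, [5, 95])
print(f"median {median:.3f}  (5%-95%: {q05:.3f} to {q95:.3f})")
```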

The answer saying that random weight initialisation is not ideal is a good one; it's a pain both for reproducibility and for other reasons (I saw you ask about this in that thread: the variance of a random initialisation has to be tuned to the depth and layer widths so that the input-output condition number is about 1, otherwise learning proceeds slowly or not at all). Several deterministic initialisation procedures have been proposed over the years. Here is one from last year that yielded promising results and had some theoretical rationale: https://arxiv.org/abs/2110.12661
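To see why the initial variance has to be tuned, here's a quick NumPy demo (deep linear net, random inputs; the width/depth values are arbitrary): unit-variance weights blow the activations up layer by layer, while 1/sqrt(fan_in) scaling keeps them at order one.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth = 512, 20

def forward_rms(weight_std):
    """RMS activation after `depth` linear layers with N(0, weight_std^2) weights."""
    x = rng.normal(size=width)
    for _ in range(depth):
        W = rng.normal(scale=weight_std, size=(width, width))
        x = W @ x
    return np.sqrt(np.mean(x ** 2))

rms_naive = forward_rms(1.0)                    # unit variance: explodes with depth
rms_scaled = forward_rms(1.0 / np.sqrt(width))  # fan-in scaling: stays O(1)
print(rms_naive, rms_scaled)
```

Standard schemes like Glorot and He initialisation refine this fan-in scaling to account for nonlinearities; the paper linked above goes further and removes the randomness altogether.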

Unfortunately, their proposed approach isn't available out of the box in TF or PyTorch, but it shouldn't be too tough to implement by hand if you have the time.

2