Comments
tripple13 t1_it1vs7l wrote
Being pedantic is at least not a prerequisite.
Normalization is just centering the data and rescaling it to unit variance, which these researchers are fully aware of.
Does that mean you suddenly transform Poisson distributed data into Gaussian? No.
Is it a big mistake to name it as such? Ahhh, I don't know. Is it a measure of their mathematical ability? No, definitely not.
Does it tell you something about the level of pedanticity (I don't even know if that's a word) of the person? Maybe.
I'd argue there are many ways to become successful in this field. One may be very specific and T-person oriented (measure theory, for instance); others may be more rounded and broad-based. Whatever works for you.
kfmfe04 t1_it1m1g3 wrote
Without looking at the links: he standardized the distribution from (mu, sigma) to (0, 1). This is basic.
rehrev t1_it1prmk wrote
Yeah, that's basic and unrelated to Gaussianness.
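For anyone who wants to check the "(mu, sigma) to (0, 1) is unrelated to Gaussianness" point numerically, here is a minimal sketch using only the Python standard library. The rate of 4.0, the sample size, the seed, and the helper names (`sample_poisson`, `standardize`, `skewness`) are all illustrative assumptions, not anything from the videos. It standardizes a Poisson sample and compares the skewness before and after; a Gaussian has skewness 0, and an affine rescaling cannot change skewness.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def standardize(values):
    """Subtract the sample mean and divide by the (population) std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def skewness(values):
    """Third standardized moment; 0 for any symmetric distribution."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / std) ** 3 for v in values])


rng = random.Random(0)  # fixed seed, chosen arbitrarily
raw = [sample_poisson(4.0, rng) for _ in range(20000)]
z = standardize(raw)

print("mean after:", statistics.fmean(z))
print("std after: ", statistics.pstdev(z))
print("skew before:", skewness(raw))
print("skew after: ", skewness(z))
```

The mean and standard deviation land on 0 and 1, but the skewness is the same before and after, so the shape of the distribution is still Poisson-like rather than Gaussian.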
[deleted] OP t1_it1ku5u wrote
[removed]
fasttosmile t1_it243op wrote
You skimmed through multiple one hour long videos?
lmao
impossiblefork t1_it251or wrote
Neither he nor you misunderstood anything. Using expressions freely is neither a sign of ability nor its absence. Perhaps it's a sign that you can think fairly freely about things in this field, perhaps it's a sign that there's room for less ad-hoc and more precise thinking.
Fabulous-Nobody- t1_it2g1um wrote
I think the confusion comes from the fact that normal and Gaussian are synonyms in many (but not all) contexts relating to probability theory. People then confuse normal and normalized, which leads to statements such as those in the videos.
To answer your question: no, many successful researchers in ML are computer scientists or engineers with no rigorous understanding of probability theory and statistics.
float16 t1_it1mphe wrote
I looked at it. He misspoke and probably meant that batch normalization moves the preactivations closer to the region of the activation function (tanh in this case) where the derivative is far from 0.
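A rough way to see the tanh point, as a sketch under stated assumptions: the preactivations here are simulated as Gaussian draws with mean 3 and standard deviation 5 (arbitrary numbers picked to force saturation), and `batch_standardize` is a hypothetical helper that only does the subtract-mean, divide-by-std step. It leaves out the learnable gain and bias that real batch normalization layers have.

```python
import math
import random
import statistics


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, close to 0 when |x| is large."""
    return 1.0 - math.tanh(x) ** 2


def batch_standardize(batch, eps=1e-5):
    """Normalize a batch to zero mean and (roughly) unit variance."""
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch)
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


rng = random.Random(0)  # fixed seed, chosen arbitrarily
# Assumed preactivations: badly scaled, so most of them saturate tanh.
preacts = [rng.gauss(3.0, 5.0) for _ in range(10000)]
normed = batch_standardize(preacts)

grad_before = statistics.fmean([tanh_grad(x) for x in preacts])
grad_after = statistics.fmean([tanh_grad(x) for x in normed])

print("mean tanh'(x) before normalization:", grad_before)
print("mean tanh'(x) after normalization: ", grad_after)
```

The average derivative is much larger after normalization because most values now sit near 0, where tanh is steepest. Nothing here makes the values Gaussian; it only recenters and rescales them.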
Also, "Gaussian" is often used to refer to the standard normal distribution.
Same kind of deal as when lots of people say "convolution" when they mean "cross-correlation."
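To make the difference concrete, here is a small 1-D sketch in plain Python. The function names and the example signal and kernels are made up for illustration, and only the "valid" (no padding) case is handled. Cross-correlation slides the kernel as-is; convolution slides the flipped kernel.

```python
def cross_correlate(signal, kernel):
    """Slide the kernel over the signal without flipping it ('valid' mode)."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def convolve(signal, kernel):
    """True convolution: cross-correlation with the kernel reversed."""
    return cross_correlate(signal, kernel[::-1])


signal = [1, 2, 3, 4, 5]
asymmetric = [1, 0, -1]
symmetric = [1, 2, 1]

print(cross_correlate(signal, asymmetric))  # [-2, -2, -2]
print(convolve(signal, asymmetric))         # [2, 2, 2]
print(cross_correlate(signal, symmetric) == convolve(signal, symmetric))  # True
```

For a symmetric kernel the two agree, and for a learned kernel the flip just changes which weights get learned, which is probably why the loose naming rarely causes trouble in practice.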
To answer your question, no, it is not necessary, but good researchers often have a solid math foundation.