netw0rkf10w OP t1_j71r8e8 wrote
Reply to comment by CyberDainz in [D] ImageNet normalization vs [-1, 1] normalization by netw0rkf10w
Any references?
netw0rkf10w OP t1_j6zbfz4 wrote
Reply to comment by MadScientist-1214 in [D] ImageNet normalization vs [-1, 1] normalization by netw0rkf10w
Indeed. Maybe we have a new battle between [-1, 1] and [0, 1] lol.
netw0rkf10w OP t1_j6zbbkb wrote
Reply to comment by nicholsz in [D] ImageNet normalization vs [-1, 1] normalization by netw0rkf10w
Agreed!
netw0rkf10w OP t1_j6zb957 wrote
Reply to comment by puppet_pals in [D] ImageNet normalization vs [-1, 1] normalization by netw0rkf10w
If I remember correctly, it was first used in AlexNet, which started the deep learning era. I agree that it doesn't make much sense nowadays, but it's still used everywhere :\
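For concreteness, here is what the two schemes being debated look like as torchvision transforms (just a sketch; the mean/std values are the standard ImageNet statistics):

```python
import torch
from torchvision import transforms

# ImageNet normalization: per-channel statistics computed on ImageNet,
# hard-coded in countless training pipelines since the AlexNet era.
imagenet_norm = transforms.Compose([
    transforms.ToTensor(),  # converts a PIL image to a float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# [-1, 1] normalization: maps [0, 1] to [-1, 1] via (x - 0.5) / 0.5.
pm1_norm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
```

Both are just affine maps applied after ToTensor's [0, 1] output; they differ only in the constants.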
netw0rkf10w OP t1_j6z15t0 wrote
Reply to comment by nicholsz in [D] ImageNet normalization vs [-1, 1] normalization by netw0rkf10w
I think normalization is here to stay (maybe not the ImageNet one though), as it usually speeds up training.
netw0rkf10w OP t1_j6z0oia wrote
Reply to comment by melgor89 in [D] ImageNet normalization vs [-1, 1] normalization by netw0rkf10w
So no noticeable difference in performance in your experiments?
netw0rkf10w OP t1_j2939o2 wrote
Reply to comment by TimDarcet in [D] What are the strongest plain baselines for Vision Transformers on ImageNet? by netw0rkf10w
You are right, indeed. Not sure why I missed that. I guess one can conclude that DeiT 3 is currently SoTA for training from scratch.
netw0rkf10w OP t1_j0gcgxy wrote
Reply to comment by TimDarcet in [D] What are the strongest plain baselines for Vision Transformers on ImageNet? by netw0rkf10w
Thanks. DeiT is actually a very nice paper from which one can learn a lot of things. But the training regimes that they used seem a bit long to me: 300 to 800 epochs. The authors of MAE managed to achieve 82.3% for ViT-B after only 100 epochs, so I'm wondering if anyone in the literature has ever been able to match that.
netw0rkf10w t1_iyzid5m wrote
Reply to comment by marcodena in [D] PyTorch 2.0 Announcement by joshadel
That's a good point. Though it's still unclear to me why that would result in no speedup.
netw0rkf10w t1_iyqbmun wrote
Reply to [D] PyTorch 2.0 Announcement by joshadel
The new compiler is so cool!!
Though virtually no speed-up on ViT: https://pbs.twimg.com/media/Fi_CUQRWQAAL-rf?format=png&name=large. Does anyone have an idea why?
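For anyone who wants to poke at this locally, here is roughly how I'd time it (just a sketch: it assumes timm and a CUDA GPU, and the model variant and batch size are arbitrary picks, not necessarily those from the benchmark):

```python
import time
import torch
import timm

model = timm.create_model("vit_base_patch16_224").cuda().eval()
compiled = torch.compile(model)  # the new PyTorch 2.0 entry point

x = torch.randn(32, 3, 224, 224, device="cuda")

with torch.no_grad():
    for name, m in [("eager", model), ("compiled", compiled)]:
        m(x)  # warm-up; triggers compilation for the compiled version
        torch.cuda.synchronize()
        t0 = time.perf_counter()
        for _ in range(10):
            m(x)
        torch.cuda.synchronize()
        print(f"{name}: {time.perf_counter() - t0:.3f}s")
```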
netw0rkf10w t1_ivgnfd3 wrote
Reply to comment by Competitive_Dog_6639 in [D] Git Re-Basin Paper Accused of Misinformation by fryingnem0
Could you comment on parts A, B, and D? Let's consider the review in its entirety.
netw0rkf10w t1_ived3vs wrote
The paper is accused of being simply a rehash of previous work (which is a much stronger claim than "misleading (presentation of) contributions"). The accuser supported his claim with detailed technical arguments, which I find rather convincing, but of course I would prefer to hear from the authors, and especially from other experts, before drawing any conclusions.
In general, I believe that "misleading contributions" should not be tolerated in academic research.
However the results turn out, I love the openness of ICLR. There is a paper accepted at NeurIPS 2022 that is presented in quite a misleading manner (even though related work had been privately communicated to the authors via email during the review process). I would have loved to post a comment, not to accuse anyone of anything, but to point out previous work and provide technical clarifications that I think would be beneficial to the readers (including the reviewers). Unfortunately this is not possible.
P.S.: Some previous comments question the use of the word "misinformation". I would have used "misleading" (which is more common in academia, though perhaps a bit light if the accusation is true), but as a non-native English speaker I don't feel much difference between "misinformation" and "misleading". According to the Oxford dictionary, they are more or less the same:
>misinformation: the act of giving wrong information about something; the wrong information that is given
>
>misleading: giving the wrong idea or impression and making you believe something that is not true
The point here is that the accuser may not be a native English speaker either, so his technical arguments should not be dismissed because of this word choice.
netw0rkf10w t1_j7u86nu wrote
Reply to [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee
The work is amazing and the post is very informative. Thanks!