Competitive_Dog_6639
Competitive_Dog_6639 t1_j8swbzz wrote
Reply to comment by bernhard-lehner in [D] Lion , An Optimizer That Outperforms Adam - Symbolic Discovery of Optimization Algorithms by ExponentialCookie
EVolved sign momEntum (EVE) 🤣
Competitive_Dog_6639 t1_j8qa7em wrote
Reply to [D] Lion , An Optimizer That Outperforms Adam - Symbolic Discovery of Optimization Algorithms by ExponentialCookie
ML acronyms are getting out of hand; just use any letter from any of the words, I guess...
Competitive_Dog_6639 t1_j42pzbq wrote
Reply to [D] Can someone point to research on determining usefulness of samples/datasets for training ML models? by HFSeven
Not exactly what you are describing with the A, B, C groups, but this recent paper examines data pruning by introducing a measure of how useful individual samples are: https://openreview.net/forum?id=UmvSlP-PyV
Competitive_Dog_6639 t1_j30hjmd wrote
The situation you are describing is not possible in theory. If a matrix is PSD and invertible, it must be positive definite, and the inverse of a positive definite matrix is also positive definite, which means it can only yield positive Mahalanobis distances (or zero if the vectors are identical). https://math.stackexchange.com/questions/2288067/inverse-of-a-symmetric-positive-definite-matrix
In practice, this can happen due to small eigenvalues and numerical error. The easiest fix, as others suggest, is to add the identity scaled by a small constant, like in ridge regression.
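A minimal NumPy sketch of that fix (`eps` is an illustrative constant; scale it to the magnitude of your covariance's eigenvalues):

```python
import numpy as np

def mahalanobis(x, y, cov, eps=1e-6):
    """Mahalanobis distance with a small ridge term for numerical stability."""
    # cov + eps*I is positive definite even if cov has tiny (or slightly
    # negative, due to round-off) eigenvalues
    cov_reg = cov + eps * np.eye(cov.shape[0])
    diff = np.asarray(x) - np.asarray(y)
    # solve rather than explicitly invert: better conditioned
    return np.sqrt(diff @ np.linalg.solve(cov_reg, diff))
```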
Competitive_Dog_6639 t1_j0jwrt1 wrote
Reply to comment by Ok-Teacher-22 in [R] Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow by Ok-Teacher-22
Still some annoying bug-ish things in torch for sure, like this shuffle error (the fix isn't very satisfying, and it's easy to overlook): https://github.com/pytorch/pytorch/issues/31771
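I'm not certain exactly which shuffle bug that issue tracks, but a well-known member of this class of silent bugs is NumPy's RNG state being duplicated across DataLoader workers. A minimal sketch of the usual workaround, with illustrative names:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset

class NoisyDataset(Dataset):
    """Toy dataset whose samples depend on NumPy's global RNG."""
    def __len__(self):
        return 8
    def __getitem__(self, idx):
        # without per-worker reseeding, every worker process inherits the
        # same NumPy state and returns identical "random" values
        return np.random.rand()

def seed_worker(worker_id):
    # derive a distinct NumPy seed from torch's per-worker seed
    np.random.seed(torch.initial_seed() % 2**32)

loader = DataLoader(NoisyDataset(), batch_size=4,
                    num_workers=2, worker_init_fn=seed_worker)
```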
Competitive_Dog_6639 t1_iz2oaue wrote
Reply to [R] The Forward-Forward Algorithm: Some Preliminary Investigations [Geoffrey Hinton] by shitboots
Hinton is awesome and I really enjoyed his NeurIPS talk. Naive question: are single-layer gradients biologically plausible? My understanding is that gradients back through multiple layers are not. The FF algorithm still uses gradients for single layers, though, right?
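For concreteness, here's a minimal sketch of what those single-layer gradients look like in FF, assuming the goodness function (sum of squared activations) and threshold loss described in the paper; sizes and hyperparameters are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    """One Forward-Forward layer, trained with a purely local gradient."""
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.threshold = threshold  # goodness threshold (illustrative value)
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # length-normalize the input so a layer can't simply pass along
        # the previous layer's goodness
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # goodness = sum of squared activations
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        # push positive goodness above the threshold, negative below
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()  # gradient never leaves this layer
        self.opt.step()
        # detach outputs so the next layer sees no gradient from this one
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```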
Competitive_Dog_6639 t1_ivfrjh9 wrote
My take: the prior work mentioned doesn't undermine the main claim of the paper, which is that without retraining one can find permutations that map nets to the same basin.
The objection to this point is raised in part C) by the commenter, where an appeal is made to the commenter's own paper. I read that paper and didn't see explicit ideas related to connected modes. Plus, the commenter's paper retrains the nets, which goes against the main idea of git re-basin. The ideas of mode connectivity may be latent there, but they are not mentioned at all. Why is it the job of the git re-basin authors to dig so deeply into one of thousands of related papers to credit the commenter for an idea that isn't even explicitly discussed? I would also point out that the commenter's paper might be missing related references to things like SWA, so maybe nobody's perfect?
Even if many of the methods come from previous work, I don't see anything that undermines the central claim of git re-basin, and for me that's that: it's an original and important idea. Could it relate better to previous work? Sure.
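For concreteness, a minimal sketch of the permutation symmetry behind that claim (toy sizes and random weights; this illustrates the invariance itself, not the paper's actual weight-matching algorithm):

```python
import torch

# Permuting a hidden layer's units (and un-permuting the next layer's
# input columns) leaves the network's function unchanged, which is why
# two trained nets can potentially be mapped into the same loss basin
# without retraining.
torch.manual_seed(0)
d_in, d_hidden, d_out = 4, 8, 3
W1, b1 = torch.randn(d_hidden, d_in), torch.randn(d_hidden)
W2 = torch.randn(d_out, d_hidden)

perm = torch.randperm(d_hidden)
W1_p, b1_p = W1[perm], b1[perm]  # permute hidden units
W2_p = W2[:, perm]               # permute the matching input columns

x = torch.randn(5, d_in)
y = torch.relu(x @ W1.T + b1) @ W2.T
y_p = torch.relu(x @ W1_p.T + b1_p) @ W2_p.T
assert torch.allclose(y, y_p)  # identical outputs under the permutation
```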
Competitive_Dog_6639 t1_jcbhoi1 wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
NLP researchers are breathing a massive sigh of relief, because if GPT-4 is unpublished they don't need to include it in benchmarks for their new papers 😆