ReginaldIII t1_ixi0bfl wrote on November 23, 2022 at 4:20 PM

Look up how often people like LeCun actively avoid citing Schmidhuber's "most cited papers in the field" out of little more than unprofessional spite.

> it'd be much better if he was actually encouraging people to adopt them the proper way

He is and does. That's literally why they are highly cited papers in the first place.

His complaint about not being cited isn't aimed at the wider community, who do cite him. It's aimed at the major players who actively refuse to cite him.

mcbainVSmendoza t1_ixjdvwa wrote on November 23, 2022 at 9:47 PM

"but only if you really squint" Bingo. That's what feels so petty to me. That's where you really see ego behind the wheel.

crouching_dragon_420 t1_ixi9vba wrote on November 23, 2022 at 5:23 PM

Have you ever read some random RNN paper from LeCun's group and noticed they didn't cite the LSTM paper but instead cited the GRU paper, which is a watered-down version of the LSTM?

new_name_who_dis_ t1_ixie59m wrote on November 23, 2022 at 5:51 PM

GRU cites LSTM paper so it's fine imo, especially if they're using the GRU architecture and not the LSTM architecture.

Citing the original LSTM paper is kind of dumb in general since the modern LSTM architecture is not the one described in the paper. You really need to cite one of the later papers that introduced the forget gate, if you are using the default LSTM implementation.

crouching_dragon_420 t1_ixilxbs wrote on November 23, 2022 at 6:41 PM

That's total horseshit when the architecture in the paper is almost the same as the original LSTM. I'm not talking about modern papers. If they cite GRU, they should cite LSTM as well. I don't agree with saying that GRU cites LSTM so it's fine to cite GRU but not LSTM. That shouldn't be how credit assignment works.

DigThatData t1_ixinfbc wrote on November 23, 2022 at 6:51 PM

> If they cite GRU, they should cite LSTM as well.

that's not how citations work...

> GRU cites LSTM so it's fine to cite GRU but not LSTM.

but that's literally how citations work. If you cite paper X, you are implicitly citing everything that paper X cited as well. citation graphs are transitive.

new_name_who_dis_ t1_ixiofup wrote on November 23, 2022 at 6:57 PM

Yea exactly. If you’re citing a paper you’re implicitly citing all of the papers that paper cited.

No one is citing the original perceptron paper even though pretty much every deep learning paper uses some form of a perceptron. The citation is implied: it runs from the more complex architectures you cite, to the simpler ones those papers cite, and so on until you get to the perceptron.

alwayslttp t1_ixj9zpy wrote on November 23, 2022 at 9:21 PM

All metrics are stacked massively in favour of first level citations - many entirely ignore second level and beyond. For example, a paper's "cited by" count is its most prominent metric of influence/importance, and is a count of how many papers directly cite it.

I don't know this particular beef, but it sounds like citing GRU and not LSTM is a potential slight/insult here. Exactly the kind of thing you see in petty academic rivalries. You're explicitly deciding who you're crediting with the key innovations you're building from, and you know that most people aren't chasing every sub-reference of every citation.
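
A minimal sketch of the gap between the two counts, assuming a made-up three-paper citation graph (the `CITES` dict, the `new_rnn_paper` entry, and both function names are illustrative, not real bibliographic data or any metric provider's method):

```python
from collections import deque

# Hypothetical citation graph: paper -> papers it cites directly.
CITES = {
    "new_rnn_paper": ["GRU"],
    "GRU": ["LSTM"],
    "LSTM": [],
}

def direct_cited_by(graph, target):
    """Count papers that list `target` in their own reference list."""
    return sum(target in refs for refs in graph.values())

def reaches(graph, start, target):
    """Return True if `start` reaches `target` by following citation chains."""
    seen, queue = {start}, deque([start])
    while queue:
        paper = queue.popleft()
        for ref in graph.get(paper, []):
            if ref == target:
                return True
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return False

def transitive_cited_by(graph, target):
    """Count papers that cite `target` directly or through other papers."""
    return sum(reaches(graph, p, target) for p in graph if p != target)

print(direct_cited_by(CITES, "LSTM"))      # what a "cited by" count sees
print(transitive_cited_by(CITES, "LSTM"))  # what an "implied" citation would add
```

In this toy graph the paper that cites only GRU still reaches LSTM through the chain, but it adds nothing to LSTM's direct "cited by" count.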

DigThatData t1_ixkghj5 wrote on November 24, 2022 at 2:47 AM

sounds like the problem here is the metrics then, which I'm pretty sure only became a thing extremely recently. For a long time, the only citation-based metric anyone talked about was their Erdős number, which was a tongue-in-cheek thing anyway. Concern over metrics like this is more likely than not going to damage research progress by encouraging gamification. The only "cited by" count I ever concern myself with is for sorting stuff on Google Scholar, which I never presume is an exact count or directly maps to the sorting I really need.

[D] Schmidhuber: LeCun's "5 best ideas 2012-22" are mostly from my lab, and older

new_name_who_dis_ t1_ixhxgdg wrote on November 23, 2022 at 4:01 PM