Submitted by ThePerson654321 t3_11lq5j4 in MachineLearning
ThePerson654321 OP t1_jbk6nb4 wrote
Reply to comment by farmingvillein in [D] Why isn't everyone using RWKV if it's so much better than transformers? by ThePerson654321
Thanks! I also find it very unlikely that nobody from a large organisation (OpenAI, Microsoft, Google Brain, DeepMind, Meta, etc.) would have noticed it.
farmingvillein t1_jbk819k wrote
I think it is more likely that people have seen it but dismissed it as a bit quixotic, because the RWKV project has made little effort to iterate in an "academic" fashion (i.e., with rigorous, clear testing, benchmarks, goals, comparisons, etc.). It has obviously done pieces of this, but it hasn't been sufficiently well-defined to make it easy for others to iterate on top of it, from a research POV.
This means that anyone else picking up the architecture is going to have to go through the effort of creating the whole necessary research baseline. Presumably this will happen at some point (heck, maybe someone is doing it right now), but it creates a large impediment to further iteration/innovation.