luxsteele t1_jb1b68d wrote
Reply to comment by _Arsenie_Boca_ in [R] RWKV (100% RNN) can genuinely model ctx4k+ documents in Pile, and RWKV model+inference+generation in 150 lines of Python by bo_peng
Totally agree.
I have been following this for some time, but I can't fully understand it or explain it to my collaborators.
I work in ML and have quite a bit of experience with transformers, and I still can't fully get it, let alone convince some of my collaborators that it is worth pursuing.
It is paramount that we have a paper that explains this in more detail if we want the community to consider this seriously.
Please do it!
bo_peng OP t1_jb1q5fu wrote
Yes, a paper is coming. Meanwhile, you can read https://arxiv.org/abs/2302.13939 (SpikeGPT), which is inspired by RWKV and has plenty of explanations :)