[D] Why isn't everyone using RWKV if it's so much better than transformers?
Submitted by ThePerson654321 on March 8, 2023 at 7:52 AM in MachineLearning
Nameless1995 wrote on March 9, 2023 at 5:54 PM:
I have seen some recent papers comparing against RWKV:
https://arxiv.org/abs/2302.10866
https://arxiv.org/abs/2302.13939