cipri_tom t1_jcjr8rn wrote
Reply to comment by cipri_tom in [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM) by bo_peng
Man, ChatRNN
The stars would be pouring in on the repo if you named it ChatRNN. People love an antagonist, and they love a story about "going back to the old days" and proving the old way was better.
bo_peng OP t1_jcjuejz wrote
ChatRNN is indeed a great name :)
R W K V are the four major parameters in RWKV (similar to QKV for attention).
I guess you can pronounce it like "Rwakuv" (a bit like "raccoon").