[deleted] t1_jclws3b wrote 2 years ago

Reply to comment by yehiaserag in [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM) by bo_peng

[deleted]

yehiaserag t1_jcm31zk wrote 2 years ago

We say RWKV for short; the rest describes a specific version.
