
mikljohansson t1_jckedf9 wrote

Very interesting work! I've been following this project for a while now

Can I ask a few questions?

  • What's the difference between RWKV-LM and ChatRWKV? E.g. is ChatRWKV mainly RWKV-LM streamlined for inference and ease of use, or are there more differences?

  • Are you planning to fine-tune on the Stanford Alpaca dataset (as was recently done for LLaMa and GPT-J to create instruct versions of them), or on a similar GPT-generated instruction dataset? I'd love to see an instruct-tuned version of RWKV-LM 14B with an 8k+ context length!

3

bo_peng OP t1_jcmajpx wrote

  • RWKV-LM is now mainly for training, while ChatRWKV is optimized for inference (a minimal inference sketch follows below).
  • Someone in the RWKV Discord tried fine-tuning it with LoRA (https://github.com/Blealtan/RWKV-LM-LoRA) and the results are quite nice (a generic LoRA sketch also follows below). Join the RWKV Discord for the latest updates :)
3
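
For anyone curious what ChatRWKV-style inference looks like in code, here is a minimal sketch assuming the `rwkv` pip package (`pip install rwkv`); the checkpoint name, tokenizer file, and sampling parameters below are illustrative placeholders, not a recommended configuration:

```python
# Minimal RWKV inference sketch. Assumes `pip install rwkv` plus a downloaded
# RWKV-4 checkpoint and tokenizer; the file names here are placeholders.
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# The strategy string controls device/precision placement (e.g. 'cpu fp32').
model = RWKV(model='RWKV-4-Pile-14B-20230313-ctx8192-test1050',
             strategy='cuda fp16')
pipeline = PIPELINE(model, '20B_tokenizer.json')

# Sampling settings for generation.
args = PIPELINE_ARGS(temperature=1.0, top_p=0.85)

prompt = "Here is a short explanation of the RWKV architecture:\n"
output = pipeline.generate(prompt, token_count=200, args=args)
print(output)
```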
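And for readers unfamiliar with LoRA: the idea is to freeze the pretrained weights and train only small low-rank update matrices added on top of them. The snippet below is a generic PyTorch illustration of that idea, not the actual RWKV-LM-LoRA code; the class name, rank, and alpha are made up for the example:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update (W + B @ A)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weight
        # Low-rank factors: A is small random, B starts at zero so training
        # begins from the unmodified pretrained behaviour.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

# Example: wrap one projection; only the small A/B matrices get gradients.
layer = LoRALinear(nn.Linear(1024, 1024))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))
```

Because only the adapter matrices are trained, the memory and storage cost of a fine-tune is a tiny fraction of a full-parameter run, which is why it is practical on a 14B model.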