blueSGL t1_jcjga2i wrote
Is it possible to split the model and run inference across multiple lower-VRAM GPUs, or does a single card need at least 16 GB of VRAM?
bo_peng OP t1_jcjuhix wrote
Yes, ChatRWKV v2 supports that :)
Take a look at the "strategy" guide: https://pypi.org/project/rwkv/
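For reference, a minimal sketch of how the rwkv package's "strategy" string can split layers across two GPUs. The checkpoint path and the layer split below are placeholders, not taken from the thread; see the PyPI page above for the full strategy syntax.

```python
from rwkv.model import RWKV

# Place the first 20 layers on cuda:0 in fp16 and the remaining layers on cuda:1.
# '/path/to/RWKV-model.pth' is a placeholder for an actual downloaded checkpoint.
model = RWKV(
    model='/path/to/RWKV-model.pth',
    strategy='cuda:0 fp16 *20 -> cuda:1 fp16',
)

# forward() takes a list of token ids and an (optional) recurrent state,
# and returns the logits for the last token plus the updated state.
out, state = model.forward([187, 510, 1563], None)
print(out)
```

The same strategy syntax also lets you offload some layers to CPU (e.g. 'cuda fp16 *20 -> cpu fp32'), which is another way to fit a large model when no single GPU has enough VRAM.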