MysteryInc152 OP t1_jcputc0 wrote
It uses relative positional encoding, so the context is unbounded in theory, but because it was trained on 2048 tokens of context, performance gradually declines beyond that. Fine-tuning for a longer context should be feasible, though.
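For anyone unfamiliar with why that works, here is a minimal sketch of one common additive relative-position bias (T5-style). It's illustrative only and not necessarily the exact scheme this model uses; the key point is that the bias depends only on token distance, so there's no hard length cap, just distances the model never saw during training:

```python
import torch

def relative_position_bias(seq_len: int, num_heads: int, max_distance: int = 128) -> torch.Tensor:
    # Distance matrix: entry [i, j] = j - i, clamped to the trained range.
    positions = torch.arange(seq_len)
    rel = positions[None, :] - positions[:, None]                 # (seq_len, seq_len)
    rel = rel.clamp(-max_distance, max_distance) + max_distance   # shift into [0, 2*max_distance]
    # One learned scalar per (head, clamped distance); random here for the sketch.
    bias_table = torch.randn(num_heads, 2 * max_distance + 1)
    return bias_table[:, rel]                                     # (num_heads, seq_len, seq_len)

# The bias is added to raw attention scores before softmax, e.g.:
# scores = q @ k.transpose(-2, -1) / d_head**0.5 + relative_position_bias(seq_len, num_heads)
bias = relative_position_bias(seq_len=16, num_heads=8)
print(bias.shape)  # torch.Size([8, 16, 16])
```

Nothing in this math breaks past the training length; attention at unseen distances just falls back on clamped or extrapolated biases, which is why quality degrades gradually rather than stopping at a wall.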
You can run it with FP16 (13 GB RAM), 8-bit (10 GB), or 4-bit (6 GB) quantization.
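Not the OP's exact commands, but a rough sketch of what those three memory footprints typically correspond to in a Hugging Face transformers + bitsandbytes setup (the model id below is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model id -- substitute whichever checkpoint you're actually running.
model_id = "your-org/your-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# FP16 (~13 GB in this case):
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# 8-bit (~10 GB), via bitsandbytes:
# model = AutoModelForCausalLM.from_pretrained(
#     model_id, load_in_8bit=True, device_map="auto"
# )

# 4-bit (~6 GB), needs a recent transformers + bitsandbytes:
# model = AutoModelForCausalLM.from_pretrained(
#     model_id, load_in_4bit=True, device_map="auto"
# )
```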
Temporary-Warning-34 t1_jcpwx16 wrote
Relative positional encoding isn't forever, though.
MysteryInc152 OP t1_jcpxcn5 wrote
Oh, for sure. I changed it to "long context"; I think that's better. I just meant there's no hard context limit.