
MysteryInc152 OP t1_jcputc0 wrote

It uses relative positional encoding, so long context is possible in theory, but because it was trained with a 2048-token context, performance gradually declines beyond that. Fine-tuning for a longer context wouldn't be impossible, though. A rough sketch of the idea is below.
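To illustrate what "relative" means here (this is not this model's actual code; ALiBi is just one well-known relative scheme, used as an example), the attention scores get a bias that depends only on the distance between positions, not on absolute position, which is why extrapolating past the training length degrades gradually rather than breaking outright:

```python
# Minimal ALiBi-style sketch of a relative position bias (illustrative only).
import torch

def relative_position_bias(seq_len: int, num_heads: int) -> torch.Tensor:
    positions = torch.arange(seq_len)
    # distances[i, j] = j - i; clamp so queries only attend backwards (causal).
    distances = (positions[None, :] - positions[:, None]).clamp(max=0)
    # One fixed negative slope per head, geometric as in ALiBi.
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    # Shape (num_heads, seq_len, seq_len); added to raw attention scores before softmax:
    # scores = q @ k.transpose(-2, -1) / d_head**0.5 + relative_position_bias(T, H)
    return slopes[:, None, None] * distances
```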

You can run it with FP16 (13 GB RAM), 8-bit (10 GB), or 4-bit (6 GB) quantization.
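The comment doesn't say which loader is meant, but assuming a Hugging Face checkpoint and the transformers + bitsandbytes stack, here's one way to hit those memory tiers (`some-org/some-model` is a placeholder, not the actual repo name):

```python
# Sketch only: loading the same checkpoint at FP16, 8-bit, or 4-bit precision.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "some-org/some-model"  # placeholder, not the real repo

# FP16 (highest memory use of the three options above)
model_fp16 = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# 8-bit quantization
model_int8 = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# 4-bit quantization (lowest memory use)
model_int4 = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
```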
