BackgroundFeeling707 t1_j05lywf wrote

Hi, how do local language model inference tools such as KoboldAI's web UI keep information across turns? I understand you can only process a certain number of tokens in one go.

Does it just use the last 30 tokens or so in the new batch?
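Something like this rolling-window truncation is what I'm imagining (a sketch only; the function name and the window/reserve numbers are my guesses, not KoboldAI's actual settings):

```python
def truncate_context(tokens, max_context=2048, reserve=256):
    """Keep only the newest tokens so that the prompt plus the next
    generation fits in the model's context window.
    max_context/reserve are placeholder values, not real settings."""
    budget = max_context - reserve
    # Drop the oldest tokens once the history exceeds the budget.
    return tokens[-budget:] if len(tokens) > budget else tokens
```

So each new batch would see only the most recent `budget` tokens, and anything older silently falls out of view.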

Eventually I run out of memory and I'm unable to continue the text adventure. It shouldn't do that, right?

Are there techniques to store info?
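One approach I've seen described is a pinned "memory" block that gets re-inserted at the top of the prompt every turn, so it never falls out of the window. Roughly like this (my own sketch with a toy whitespace tokenizer, not how any particular tool actually implements it):

```python
def build_prompt(memory, history_tokens, max_context=2048, reserve=256):
    """Prepend pinned memory tokens each turn; only the rolling
    history is trimmed to fit the remaining token budget.
    All names and numbers here are illustrative placeholders."""
    mem_tokens = memory.split()  # toy tokenizer: split on whitespace
    budget = max_context - reserve - len(mem_tokens)
    # Memory always survives; old history gets cut first.
    return mem_tokens + history_tokens[-budget:]
```

That way key facts (characters, inventory, setting) stay in the prompt no matter how long the adventure runs.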
