BackgroundFeeling707 t1_j05lywf wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hi, how do local language model inference frontends such as KoboldAI's web UI keep information? I understand you can only feed in a certain number of tokens in one go.
Does it just use the last 30 tokens or so in the new batch?
Eventually I run out of memory and am unable to continue the text adventure. It shouldn't do that, right?
Are there techniques to store info?
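A common approach (a sketch only; the function name, tokenizer stand-in, and parameters here are invented for illustration, not KoboldAI's actual implementation) is to pin a fixed "memory" block at the front of the prompt and fill the remaining context window with the most recent history, dropping the oldest tokens first:

```python
# Hypothetical sketch of sliding-window context truncation.
# Real frontends tokenize with the model's own tokenizer and also
# reserve room in the window for the tokens about to be generated.

def build_prompt(memory, history, max_context=2048, reserve=80):
    """Pin `memory` at the front, then fill the rest of the window
    with the most recent history tokens (oldest dropped first)."""
    budget = max_context - reserve - len(memory)
    if budget < 0:
        raise ValueError("memory alone exceeds the context window")
    return memory + (history[-budget:] if budget > 0 else [])

# Toy "tokens" (real code would use token IDs from a tokenizer).
memory = ["You", "are", "a", "knight", "."]
history = [f"tok{i}" for i in range(5000)]

prompt = build_prompt(memory, history)
print(len(prompt))  # prints 1968: 5 memory tokens + 1963 recent history tokens
```

Because the window only ever holds `max_context` tokens, memory use stays bounded no matter how long the adventure runs; anything older than the window is simply forgotten unless it is copied into the pinned memory block.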
BackgroundFeeling707 t1_irgnvop wrote
Reply to [R] Google AudioLM produces amazing quality continuation of voice and piano prompts by valdanylchuk
When can we play with such a thing?
BackgroundFeeling707 t1_j05mhuh wrote
Reply to [D] Simple Questions Thread by AutoModerator
In general, for Stable Diffusion, why are there often large VRAM spikes at the end of inference, and what kinds of code techniques are used to solve this problem?
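One likely cause is the final VAE decode, which turns the whole latent into a full-resolution image in a single pass and allocates all the intermediate activations at once; libraries such as diffusers expose `enable_vae_slicing()` / `enable_vae_tiling()` to split that step up. The toy stand-in below (no real VAE, just lists) only illustrates why chunking bounds the peak temporary allocation:

```python
# Toy illustration of the "slicing" idea behind VAE memory fixes.
# `decode` stands in for a decoder that processes everything at once;
# `decode_sliced` does the same work a few rows at a time, so the
# largest temporary buffer is `chunk` rows instead of the whole image.

def decode(latent_rows):
    # Pretend decoder: 8x upscale per latent row, all rows at once.
    return [row * 8 for row in latent_rows]

def decode_sliced(latent_rows, chunk=16):
    out = []
    for i in range(0, len(latent_rows), chunk):
        out.extend(decode(latent_rows[i:i + chunk]))
    return out

latents = list(range(64))                         # toy 64-row latent
assert decode_sliced(latents) == decode(latents)  # same output, lower peak
```

The trade-off is a little extra overhead per chunk in exchange for a much flatter memory profile at the end of inference.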