patient_zer00 t1_iujl1if wrote
Reply to [D] When the GPU is NOT the bottleneck...? by alexnasla
Disk IO is often a bottleneck.
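A quick way to check that (rough sketch, assuming PyTorch; the dataset you pass in is a placeholder): time a pass over the data with no model and no GPU work at all. If that alone is close to your full step time, the disk is the problem, not the GPU.

```python
import time
from torch.utils.data import DataLoader, Dataset

def time_data_loading(dataset: Dataset, batch_size: int = 64, num_workers: int = 4) -> float:
    """Time one full pass over the dataset: pure data loading, no model, no GPU."""
    loader = DataLoader(dataset, batch_size=batch_size,
                        num_workers=num_workers, pin_memory=True)
    start = time.perf_counter()
    for _batch in loader:
        pass  # if this alone is slow, a faster GPU won't help
    return time.perf_counter() - start
```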
Also, even though a GPU will speed up LSTM training too, the forward and backward passes have to process the sequence one time step after another, and that recurrence can't be parallelized across time steps. That's probably why the speedup you're seeing going from a K80 to an A100 isn't that big.
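To make the sequential part concrete, here's a minimal sketch (plain PyTorch, toy sizes) of what an LSTM does under the hood: step t needs the hidden state from step t-1, so the loop over time steps has to run serially no matter how big the GPU is.

```python
import torch

seq_len, batch, in_dim, hidden = 100, 32, 64, 128
x = torch.randn(seq_len, batch, in_dim)

cell = torch.nn.LSTMCell(in_dim, hidden)
h = torch.zeros(batch, hidden)
c = torch.zeros(batch, hidden)

outputs = []
for t in range(seq_len):
    # step t consumes h and c from step t-1, so steps can't run concurrently
    h, c = cell(x[t], (h, c))
    outputs.append(h)
```

The per-step matrix multiplies still run on the GPU, which is why there's some speedup, but the serial loop over time steps caps it.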
Edit: typos
patient_zer00 t1_izuqszr wrote
Reply to [D] - Has Open AI said what ChatGPT's architecture is? What technique is it using to "remember" previous prompts? by 029187
It doesn't remember anything itself; it's mostly the web app that remembers. It sometimes resends the previous prompts along with your current one (check the Chrome network logs). It then probably concatenates them and feeds them to the model as a single prompt.
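Something along these lines, roughly (illustrative sketch only; OpenAI hasn't published the actual format, so the template below is made up):

```python
def build_prompt(history: list[tuple[str, str]], new_message: str) -> str:
    """history holds (user_message, model_reply) pairs from earlier turns."""
    parts = [f"User: {user}\nAssistant: {reply}" for user, reply in history]
    parts.append(f"User: {new_message}\nAssistant:")
    return "\n".join(parts)

history = [("What's an LSTM?", "A recurrent neural network variant with gated memory cells.")]
print(build_prompt(history, "Can it be parallelized across time steps?"))
```

The model gets the whole conversation as one prompt on every turn, which is why it looks like it remembers.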