maxToTheJ t1_izvm3wq wrote
Reply to comment by farmingvillein in [D] - Has Open AI said what ChatGPT's architecture is? What technique is it using to "remember" previous prompts? by 029187
Dude, the freaking network logs in Chrome show OpenAI concatenates the prompts.
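As a rough sketch of what "concatenating the prompts" would mean here: each new user message gets appended to the prior turns and the whole transcript is resent as one prompt. The function and field names below are illustrative assumptions, not OpenAI's actual implementation.

```python
# Hypothetical sketch of prompt concatenation as a "memory" mechanism:
# the model itself is stateless; the client replays the transcript.
def build_prompt(history, new_message, max_chars=12000):
    """Append the new turn and render the full transcript as one prompt.

    history: list of (role, text) tuples; max_chars is an assumed budget
    standing in for the real token limit.
    """
    history = history + [("User", new_message)]
    render = lambda h: "\n".join(f"{r}: {t}" for r, t in h) + "\nAssistant:"
    prompt = render(history)
    # Drop the oldest turns once the transcript exceeds the budget,
    # which is why early context eventually falls out of the window.
    while len(prompt) > max_chars and len(history) > 1:
        history = history[1:]
        prompt = render(history)
    return prompt, history
```

Under this sketch, "remembering" earlier prompts is just string concatenation plus truncation from the front, consistent with what replayed requests in the browser logs would show.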
>You then proceeded to pull a tweet within that thread which was entirely irrelevant
Your exact words. Try standing by them.
> (other than being a lower bound).
A lower bound is relevant; it's basic math. Freaking entire proofs are devoted to establishing lower bounds.
I am still waiting on any proof of any extraordinary memory for a GPT-3-type model. It is extremely relevant, since explaining something requires knowing it exists in the first place.
farmingvillein t1_izvnwdh wrote
...the whole twitter thread, and my direct link to OpenAI, are about the upper bound. The 822 number is irrelevant (given that OpenAI itself tells us that the window is much longer), and the fact that you pulled it tells me that you literally don't understand how transformers or the broader technology works, and that you have zero interest in learning. Are you a Markov chain?
maxToTheJ t1_izvotec wrote
> The 822 number is irrelevant (given that OpenAI itself tells us that the window is much longer)
OpenAI says the "cache" is "3000 words (or 4000 tokens)". I don't see anything about the input being that. The Spanish test case from the poster in the Twitter thread indicates the input sits at the lower bound, which also aligns with what the base GPT-3.5 model in the paper has. The other stress test was trivial.
> ...the whole twitter thread, and my direct link to OpenAI, are about the upper bound.
Details. No hand-wavy shit: explain with examples why it's longer, especially since your position is that some magical shit not in the paper/blog is happening.
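The "3000 words (or 4000 tokens)" point matters because the window is fixed in tokens, not words, so the effective word count depends on the tokens-per-word ratio of the language. The ~1.33 tokens/word ratio below is back-solved from OpenAI's own 3000-words/4000-tokens figure; the Spanish ratio is a hypothetical illustration of why a non-English test would hit the limit sooner, not a measured value.

```python
# Sketch: a fixed token window shrinks, in words, for languages that
# tokenize less efficiently. Ratios are assumptions (see lead-in).
def effective_word_window(token_window, tokens_per_word):
    """Approximate how many words fit in a fixed token window."""
    return round(token_window / tokens_per_word)

english = effective_word_window(4000, 4000 / 3000)  # back-solved ratio -> 3000
spanish = effective_word_window(4000, 2.0)          # hypothetical ratio -> 2000
```

This is why a test in Spanish can appear to "lose memory" earlier than an English one even with the same token budget.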
farmingvillein t1_izvq3i8 wrote
> I don't see anything about the input being that.
Again, this has absolutely nothing to do with the discussion here, which is about memory outside of the prompt.
Again, how could you possibly claim this is relevant to the discussion? Only an exceptionally deep lack of conceptual understanding could cause you to make that connection.
maxToTheJ t1_izvqh2f wrote
This is boring. I am still waiting on those details.
No hand-wavy shit: explain with examples showing it's impressively longer, especially since your position is that some magical shit not in the paper/blog is happening.