granddaddy OP t1_j05pe9j wrote
Reply to comment by rafgro in [D] Getting around GPT-3's 4k token limit? by granddaddy
Very helpful. Appreciate the link. Is that your repo?
granddaddy OP t1_j051ykd wrote
Reply to comment by rafgro in [D] Getting around GPT-3's 4k token limit? by granddaddy
I'm having a hard time wrapping my head around this. Do you think you could elaborate further? Do you have a github repo by chance?
granddaddy OP t1_izyu345 wrote
Reply to comment by Acceptable-Cress-374 in [D] Getting around GPT-3's 4k token limit? by granddaddy
This sounds like the right answer (and something I need to keep in mind as well).
Just as an FYI, here's one answer I found in a Twitter thread:
- the data that needs to be fed into the model is divided into chunks
- when a user asks a question, each chunk (each likely under 4k tokens) is checked for relevance
- the relevant sections of those chunks are combined with the user's question
- the combined text is fed in as the prompt, and GPT-3 answers the user's question
There's a prebuilt OpenAI notebook you can use to replicate it; a rough code sketch of the flow is below.
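To make those steps concrete, here's a minimal sketch of that chunk-and-retrieve flow. It assumes the pre-1.0 `openai` Python SDK, the `text-embedding-ada-002` embedding model, and `text-davinci-003` for completion; the chunker and prompt template are illustrative placeholders of my own, not necessarily what the notebook does.

```python
import numpy as np
import openai

openai.api_key = "sk-..."  # your key here

def chunk_text(text, max_chars=4000):
    """Naive chunker: pack paragraphs into chunks of roughly max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = ""
        current += para + "\n\n"
    if current:
        chunks.append(current)
    return chunks

def embed(texts):
    """Embed a list of strings with the ada-002 embedding model."""
    resp = openai.Embedding.create(input=texts, model="text-embedding-ada-002")
    return [np.array(d["embedding"]) for d in resp["data"]]

def answer(question, document, top_k=3):
    # 1. Divide the document into chunks that each fit well under the 4k window.
    chunks = chunk_text(document)
    # 2. Embed every chunk and the question.
    chunk_vecs = embed(chunks)
    q_vec = embed([question])[0]
    # 3. Rank chunks by similarity to the question (ada-002 embeddings are
    #    unit length, so a dot product is cosine similarity).
    scores = [float(q_vec @ v) for v in chunk_vecs]
    best = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:top_k]
    # 4. Combine the most relevant sections with the user's question into one prompt.
    context = "\n---\n".join(chunks[i] for i in best)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    resp = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=256, temperature=0
    )
    return resp["choices"][0]["text"].strip()
```

Character counts are just a stand-in for real token counts here; in practice you'd budget the 4k window precisely with a tokenizer such as tiktoken.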
granddaddy OP t1_izytv9p wrote
Reply to comment by rafgro in [D] Getting around GPT-3's 4k token limit? by granddaddy
I found this Twitter thread that may hold the answer (or at least one way to do it):
- the data that needs to be fed into the model is divided into chunks
- when a user asks a question, each chunk (each likely under 4k tokens) is checked for relevance
- the relevant sections of those chunks are combined with the user's question
- the combined text is fed in as the prompt, and GPT-3 answers the user's question
Overall, it sounds similar to what you've done, but I wonder how the computational load compares (a brief note on that below).
There's a prebuilt OpenAI notebook you can use to replicate it.
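On the computational-load point: the per-question cost can be kept to one embedding call plus one completion call by embedding the chunks once up front and caching the vectors. A sketch, reusing the hypothetical `embed` and `chunk_text` helpers from the code above (`document` and `question` are placeholders):

```python
import numpy as np

# One-time cost: embed every chunk and cache the vectors to disk.
chunks = chunk_text(document)
np.save("chunk_vecs.npy", np.stack(embed(chunks)))

# Per question: a single embedding call, then cheap local dot products.
chunk_vecs = np.load("chunk_vecs.npy")
q_vec = np.array(embed([question])[0])
scores = chunk_vecs @ q_vec  # cosine similarity (ada-002 vectors are unit length)
```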
Submitted by granddaddy t3_zjf45w in MachineLearning
granddaddy t1_j47hbby wrote
Reply to comment by chimp73 in [D] Bitter lesson 2.0? by Tea_Pearce
This guy makes a similar comparison in his blog but goes into a bit more detail than the tweet.
https://trees.substack.com/p/false-dichotomy-and-disillusion-in
Is it worth creating your own models or extensively fine-tuning foundational models? Probably not.