Submitted by OkayRedditHereWeGo t3_125jym5 in askscience
Hi,
So a little background. I have been working as a product manager in the ML space for a while. I enjoy the new technology and I think the recent advancements in LLMs are really impressive. I know a little bit about the space, but the space is huge and I, of course, have mostly been trying to understand the things that are related to my work.
However, I feel like we are jumping to some extreme conclusions when we talk about this subject. I hear in podcasts and on LinkedIn that with the recent improvements we can get better performance with algorithmically generated training data, and I've seen a few papers linked. I guess it looks legit, but I have not seen anyone smart explain what that means to a non-technical person like myself. People are very eager to take that as truth and jump to the next conclusion built on top of it: "with self-improving algorithms we will be able to bla bla bla bla".
So, if I understand things correctly, we need to tell the model what is good knowledge (training data), and we need to give it one (or more) goal function(s) to optimize for (?). In the Sam Altman (OpenAI CEO) interview by Lex Fridman, Sam clearly states that they put a lot of effort into the pre-training data set (and is very non-talkative about the composition of the data, which I guess is probably one of the main company secrets). This is the knowledge that is put into the model; this is where we tell it what is true (and potentially what is not true). We give it one or a few metrics to optimize for, and it finds the correct tuning of the parameters over the epochs of training. In ChatGPT's case a human is also added into the loop to fine-tune the model: the human is given two options and selects the better one, as also stated in the interview.
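To make my own mental model concrete, here is a minimal Python sketch of the two ideas above. It is a toy, not how any real LLM is trained. The first function tunes a single parameter over epochs to minimize a goal function (mean squared error on made-up data). The second turns a "human picked A over B" comparison into a number to minimize; the pairwise (Bradley-Terry style) form is my assumption about one common way to do this, not something stated in the interview. All names (`train_linear`, `preference_loss`) and the data are hypothetical.

```python
import math


def train_linear(data, epochs=200, lr=0.05):
    """Fit y ~ w * x by gradient descent on mean squared error."""
    w = 0.0  # the single "parameter" being tuned
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad  # step in the direction that lowers the error
    return w


def preference_loss(score_chosen, score_rejected):
    """Pairwise loss (assumed form): low when the chosen answer scores higher."""
    prob_chosen = 1 / (1 + math.exp(-(score_chosen - score_rejected)))
    return -math.log(prob_chosen)


# Toy "training data": the pattern to learn here is y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train_linear(data)
```

The point of the toy: the training data defines what counts as "right", the goal function measures how far off the model is, and training just nudges parameters to shrink that number.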
If we conclude that we can generate better algorithmic output with algorithmically generated training data, what does that mean? To me it sounds like we can squeeze a little more performance out of the model, not that we can feed the models indefinitely with better and better training data that they have produced themselves. What is that data, even? What happened to the truths we fed the model?