
TFenrir t1_j6e9idu wrote

This makes a lot of sense.

A lot of what instruct fine-tuning and RLHF show is that if you provide some high-quality, specifically created data to an LLM while it's being fine-tuned, you get a significant jump in results for the fine-tuned model, versus just giving it more of the same unstructured data.
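To make the distinction concrete, here's a rough sketch of plain pre-training text versus an instruction-formatted example. The field names and prompt template are purely illustrative, not any specific lab's actual schema:

```python
# Plain pre-training data: just raw text, no task framing.
plain_example = "The capital of France is Paris."

# Instruction data: a human-written prompt paired with a
# high-quality response (the kind of pair that's expensive to collect).
instruction_example = {
    "instruction": "What is the capital of France?",
    "response": "The capital of France is Paris.",
}

def format_for_finetuning(ex):
    """Flatten an instruction/response pair into a single training string.

    The "Instruction:"/"Response:" template here is a made-up example;
    real pipelines each use their own prompt format.
    """
    return f"Instruction: {ex['instruction']}\nResponse: {ex['response']}"

print(format_for_finetuning(instruction_example))
```

The point is that both examples contain roughly the same facts; the instruction version just adds the task framing the model learns to follow.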

In some of the papers I've read, a lot of the conclusions are akin to "next steps are to see whether more instruction data will improve results".

One of the challenges with this instruction data is that we just don't have a lot of it. For example, we don't have many recordings of people using computers to complete tasks, like keystrokes paired with screen recordings.

This doesn't sound like they are collecting "screen" recordings (AdeptAI, for example, is doing that with their model, though only in a browser for now). It sounds more like pairing natural language descriptions with the fine-tuning data is enough to get an improvement, which makes sense from my limited experience with LLMs.

Should be interesting. I imagine this is for fine-tuning GPT-4: a "Codex 2.0" with a better base model (GPT) and probably better instruct tuning as well.
