Submitted by Vegetable-Skill-9700 t3_121a8p4 in MachineLearning
Blacky372 t1_jdl62vl wrote
GPT-J-6B with instruction finetuning will surely never be better than GPT-4. With RLHF you may reach a similar response quality in some contexts for some types of instruction, but you will never match the vast amounts of proprietary data that ClosedAI fed into a probably 250+B parameter model, with specialized expert data from literally 50 experts in various fields that worked on the response quality in their domain. This cannot be surpassed easily, unfortunately. But maybe future open source models will reach similar capabilities with advanced training techniques. I would definitely hope so.
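For anyone wondering what "instruction finetuning" on GPT-J-6B looks like in practice, here's a rough sketch using HF transformers + peft (LoRA). The dataset file, prompt template, and hyperparameters below are placeholders I made up, not anything from this thread or from OpenAI:

```python
# Minimal sketch: LoRA instruction finetuning of GPT-J-6B.
# "instructions.jsonl" and all hyperparameters are hypothetical placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_name = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Train only small low-rank adapters instead of all 6B weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# Hypothetical dataset with "prompt" and "response" fields per line.
data = load_dataset("json", data_files="instructions.jsonl")["train"]

def tokenize(example):
    text = (f"### Instruction:\n{example['prompt']}\n\n"
            f"### Response:\n{example['response']}")
    return tokenizer(text, truncation=True, max_length=1024)

data = data.map(tokenize, remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="gptj-instruct", num_train_epochs=3,
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           learning_rate=1e-4, fp16=True),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```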
blueSGL t1_jdl756z wrote
> with specialized expert data from literally 50 experts in various fields that worked on the response quality in their domain.
Sounds like a future goal for Open Assistant.
If one were being unethical... create a bot to post the current Open Assistant answers to technical questions in small specialist subreddits and wait for Cunningham's Law to come into effect. (I'm only half joking)
atheist-projector t1_jdm7mmi wrote
I love the idea of calling them ClosedAI.
That's it, I am doing it from now on.
WonderFactory t1_jdm4pk1 wrote
How long, though, before LLMs perform at the same level as experts in most fields? A year, two, three? When you get to that point you can generate synthetic data that's the same quality as human-produced data. The Reflexion paper mentioned in another thread claims that giving GPT-4 the ability to test the output of its code produces expert-level coding performance. This output could be used to train an open source model.
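Roughly what that loop could look like; this is my own sketch of the idea, not the paper's code, and the test harness and OpenAI client usage here are assumptions:

```python
# Sketch of a Reflexion-style loop: generate code, run the unit tests,
# feed failures back as context, and keep only solutions that pass.
# Passing solutions could then be collected as synthetic training data.
import subprocess
import tempfile

from openai import OpenAI

client = OpenAI()

def run_tests(code: str, tests: str) -> tuple[bool, str]:
    """Write candidate code plus its unit tests to a file and execute it."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + tests)
        path = f.name
    result = subprocess.run(["python", path], capture_output=True,
                            text=True, timeout=30)
    return result.returncode == 0, result.stdout + result.stderr

def reflexion_solve(task: str, tests: str, max_tries: int = 4):
    history = ""
    for _ in range(max_tries):
        prompt = (f"{task}\n\nPrevious attempts and test failures:\n"
                  f"{history or 'none'}\n\nReply with only the Python code.")
        code = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        passed, output = run_tests(code, tests)
        if passed:
            return code  # verified solution -> candidate synthetic training example
        history += f"\n--- attempt ---\n{code}\n--- failure ---\n{output}\n"
    return None
```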
blose1 t1_jdoj8kl wrote
GPT models struggle with out-of-distribution programming tasks, which means they can't create novel ideas. I tested this myself many times, and it's not a prompt engineering issue. I think LLMs could act as great teachers but not researchers: teachers just teach what we already know, while researchers create the novel knowledge that teachers use.
Vegetable-Skill-9700 OP t1_jdl8hh5 wrote
Agreed, it won't generalize as well as GPT-4, but it could achieve similar performance on a specialized task (say, answering technical questions around a certain topic, or writing social media posts for a certain entity, etc.).
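As a concrete (entirely made-up) illustration, the data for such a specialized task could be as simple as a small file of prompt/response pairs collected from that one domain:

```python
# Hypothetical records for a narrow, domain-specific instruction set;
# the content and file name are made up, just to show the shape of the data.
import json

records = [
    {"prompt": "How do I rotate the API keys for our billing service?",
     "response": "Create a new key in the admin console, deploy it alongside the old one, ..."},
    {"prompt": "Write a short post announcing our v2.3 release.",
     "response": "v2.3 is out: faster syncs, dark mode, and CSV export. ..."},
]

with open("instructions.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")
```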
zbyte64 t1_jdmvaak wrote
Sounds like we're realizing that a model is only as good as the experts who wrote its training data.