Submitted by michaelthwan_ai t3_121domd in MachineLearning
addandsubtract t1_jdlvmm6 wrote
Where do GPT-J and Dolly fall into this?
wywywywy t1_jdlz40b wrote
GPT-J & GPT-Neo are predecessors of GPT-NeoX-20B
michaelthwan_ai OP t1_jdlzwyi wrote
Sure. I think it's clearer to show the parents of recent models (instead of their great-great-great-grandparents).
If people want, I may consider making a full one (including older models).
wywywywy t1_jdm16va wrote
In my opinion, it'd be better to include only the currently relevant ones rather than everything under the sun.
Too much noise makes the chart less useful.
michaelthwan_ai OP t1_jdm3sgb wrote
Agreed
Puzzleheaded_Acadia1 t1_jdlx6ea wrote
Is GPT-J 6B really better than Alpaca 7B, and which runs faster?
DigThatData t1_jdmvquq wrote
the fact that it's comparable at all is pretty wild and exciting
StellaAthena t1_jdotklz wrote
It’s somewhat worse and a little faster.
[deleted] t1_jdlzp51 wrote
[deleted]
michaelthwan_ai OP t1_jdlztvv wrote
It is a good model, but it's about a year old and not related to recently released LLMs, so I didn't add it (otherwise there'd be a ton of good models).
As for Dolly, it was only released yesterday; I don't have full info on it yet.
addandsubtract t1_jdm1d9h wrote
Ok, no worries. I'm just glad there's a map to guide the madness going on atm. Adding legacy models would be good for people who come across them now, so they know they're legacy.
DigThatData t1_jdmvjyb wrote
Dolly is important precisely because the foundation model is old. They were able to get ChatGPT-level performance out of it, and they only trained it for three hours. Just because the base model is old doesn't mean this isn't recent research. It demonstrates:
- the efficacy of instruction finetuning
- that instruction finetuning doesn't require the world's biggest, most modern model, or even all that much data

Dolly isn't research from a year ago; it was first described only a few days ago.
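For context, the data side of instruction finetuning is simple: pair instructions with desired responses and format them into single training strings for an ordinary causal-LM finetuning loop. A minimal sketch below; the prompt template is the Alpaca-style one and is an assumption for illustration, not necessarily the exact format Dolly used.

```python
# Hedged sketch: formatting instruction/response pairs into training strings,
# in the style popularized by Alpaca. The exact template Dolly used may
# differ; this is illustrative only.

def format_example(instruction: str, response: str, context: str = "") -> str:
    """Turn one instruction/response pair into a causal-LM training string."""
    if context:
        return (
            "Below is an instruction that describes a task, paired with "
            "further context.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Context:\n{context}\n\n"
            f"### Response:\n{response}"
        )
    return (
        "Below is an instruction that describes a task.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )

# A small instruction-tuning "dataset": a few thousand strings like these,
# fed to an off-the-shelf pretrained causal LM with a standard finetuning
# loop, is essentially the whole recipe.
examples = [
    format_example("Summarize this thread.",
                   "A chart of recent LLM lineages was posted."),
    format_example("Translate to French.", "Bonjour", context="Hello"),
]
```

The point the comment makes is that nothing here depends on the base model being new or huge; only the finetuning data and a short training run are required.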
EDIT: Ok, I just noticed you have an ERNIE model up there, so this "no old foundation models" thing is just inconsistent.