Single_Blueberry
Single_Blueberry t1_jdnyc2d wrote
Reply to comment by MassiveIndependence8 in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Hmm, I don't know. It's pretty bad at getting dead-on accurate results, but in many cases the relative error of the result is pretty low.
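(For clarity, by "relative error" I mean something like the sketch below; the numbers are made up, not actual GPT-4 outputs:)

```python
# Relative error: how far off an answer is, scaled by the true value.
def relative_error(model_answer: float, true_answer: float) -> float:
    return abs(model_answer - true_answer) / abs(true_answer)

# Made-up example: true answer is 123456, the model says 123800
print(relative_error(123800, 123456))  # ~0.0028, i.e. off by ~0.3%
```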
Single_Blueberry t1_jdhtc58 wrote
Reply to comment by BinarySplit in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
What would keep us from just telling it the screen resolution and origin and asking for coordinates?
Or asking for coordinates in fractional image dimensions.
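Minimal sketch of what I mean, assuming the model returns fractional coordinates in [0, 1] (the button position here is made up):

```python
# Convert fractional image coordinates (0..1) into absolute pixel coordinates.
def frac_to_pixels(frac_x: float, frac_y: float, width: int, height: int) -> tuple[int, int]:
    return round(frac_x * width), round(frac_y * height)

# e.g. the model says the "Submit" button is at (0.42, 0.87) on a 1920x1080 screen
x, y = frac_to_pixels(0.42, 0.87, 1920, 1080)
print(x, y)  # 806 940
```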
Single_Blueberry t1_jcjvh6o wrote
Reply to comment by Available_Lion_652 in [D] GPT-4 is really dumb by [deleted]
Again, can't find a reliable source for that.
I personally doubt that GPT-4 is significantly larger than GPT 3.x, simply because that would also further inflate inference cost, which you generally want to avoid in a product (as opposed to a research feat).
Better architecture, better RLHF, more and better training data, more training compute? All of that seems reasonable.
Orders of magnitude larger again? Don't think so.
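Back-of-envelope on the inference cost point, using the standard approximation that a dense transformer spends roughly 2 × N FLOPs per generated token (the second parameter count below is purely hypothetical, not a claim about GPT-4):

```python
# A dense transformer needs roughly 2 * N FLOPs per generated token.
def flops_per_token(n_params: float) -> float:
    return 2 * n_params

gpt3_size = 175e9            # GPT-3's published parameter count
hypothetical = 175e9 * 100   # two orders of magnitude larger (hypothetical)

print(f"{flops_per_token(gpt3_size):.1e}")    # 3.5e+11 FLOPs/token
print(f"{flops_per_token(hypothetical):.1e}") # 3.5e+13 FLOPs/token -> ~100x the serving cost
```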
Single_Blueberry t1_jcjsxa1 wrote
Reply to comment by Available_Lion_652 in [D] GPT-4 is really dumb by [deleted]
>the fact that GPT 4 may be two magnitude orders bigger than GPT 3
I'm not aware of any reliable sources that claim that.
Intuitively I don't see why it would stop hallucinating. I imagine the corpus, as big as it may be, doesn't contain many examples of the concept of "not knowing the answer".
That's something people express a lot in private conversation, but rarely in written language on the public internet or in books, which AFAIK is where most of the training data comes from.
Single_Blueberry t1_jcjr43p wrote
Reply to [D] GPT-4 is really dumb by [deleted]
Yes, it's well known that current language models are pretty bad at math.
Single_Blueberry t1_j6x9v3s wrote
>Does this mean that only well-funded corporations will be able to train general-purpose LLM
No, they are just always a couple of years ahead.
That's not just a thing with language models, or even with ML; it's like that with many technologies.
Single_Blueberry t1_jdnz55p wrote
Reply to Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Just because there's a more efficient architecture doesn't mean it will also benefit further from increasing its size.
"We" didn't build a 100B+ parameter model because that's exactly what we need, but because that's the current limit of what we can do.