Single_Blueberry
Single_Blueberry t1_jdnyc2d wrote
Reply to comment by MassiveIndependence8 in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Hmm, I don't know. It's pretty bad at getting dead-on accurate results, but in many cases the relative error of the result is pretty low.
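(For clarity, by "relative error" I mean something like the sketch below; the numbers are made up, not actual GPT-4 outputs:)

```python
# Relative error: how far off an answer is, scaled by the true value.
def relative_error(model_answer: float, true_answer: float) -> float:
    return abs(model_answer - true_answer) / abs(true_answer)

# Made-up example: true answer is 123456, the model says 123800
print(relative_error(123800, 123456))  # ~0.0028, i.e. off by ~0.3%
```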
Single_Blueberry t1_jdhtc58 wrote
Reply to comment by BinarySplit in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
What would keep us from just telling it the screen resolution and origin and asking for coordinates?
Or asking for coordinates in fractional image dimensions.
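Minimal sketch of what I mean, assuming the model returns fractional coordinates in [0, 1] (the button position here is made up):

```python
# Convert fractional image coordinates (0..1) into absolute pixel coordinates.
def frac_to_pixels(frac_x: float, frac_y: float, width: int, height: int) -> tuple[int, int]:
    return round(frac_x * width), round(frac_y * height)

# e.g. the model says the "Submit" button is at (0.42, 0.87) on a 1920x1080 screen
x, y = frac_to_pixels(0.42, 0.87, 1920, 1080)
print(x, y)  # 806 940
```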
Single_Blueberry t1_jcjvh6o wrote
Reply to comment by Available_Lion_652 in [D] GPT-4 is really dumb by [deleted]
Again, can't find a reliable source for that.
I personally doubt that GPT-4 is significantly larger than GPT 3.x, simply because that would also further inflate inference cost, which you generally want to avoid in a product (as opposed to a research feat).
Better architecture, better RLHF, more and better training data, more training compute? All of that seems reasonable.
Orders of magnitude larger again? Don't think so.
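Back-of-envelope on the inference cost point, using the standard approximation that a dense transformer spends roughly 2 × N FLOPs per generated token (the second parameter count below is purely hypothetical, not a claim about GPT-4):

```python
# A dense transformer needs roughly 2 * N FLOPs per generated token.
def flops_per_token(n_params: float) -> float:
    return 2 * n_params

gpt3_size = 175e9            # GPT-3's published parameter count
hypothetical = 175e9 * 100   # two orders of magnitude larger (hypothetical)

print(f"{flops_per_token(gpt3_size):.1e}")    # 3.5e+11 FLOPs/token
print(f"{flops_per_token(hypothetical):.1e}") # 3.5e+13 FLOPs/token -> ~100x the serving cost
```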
Single_Blueberry t1_jcjsxa1 wrote
Reply to comment by Available_Lion_652 in [D] GPT-4 is really dumb by [deleted]
>the fact that GPT 4 may be two magnitude orders bigger than GPT 3
I'm not aware of any reliable sources that claim that.
Intuitively I don't see why it would stop hallucinating. I imagine the corpus, as big as it may be, doesn't contain many examples of the concept of "not knowing the answer".
That's something people express a lot in private conversation, but rarely in written language on the public internet or in books, which AFAIK is where most of the training data comes from.
Single_Blueberry t1_jcjr43p wrote
Reply to [D] GPT-4 is really dumb by [deleted]
Yes, it's well known that current language models are pretty bad at math.
Single_Blueberry t1_j6x9v3s wrote
>Does this mean that only well-funded corporations will be able to train general-purpose LLM
No, they are just always a couple of years ahead.
That's not just a thing with language models, or even with ML; it's like that with many technologies.
Single_Blueberry t1_jdnz55p wrote
Reply to Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Just because there's a more efficient architecture doesn't mean it will also benefit further from increasing its size.
"We" didn't build a 100B+ parameter model because that's exactly what we need, but because that's the current limit of what we can do.