Comments
Hands0L0 t1_j9i277j wrote
I for one welcome competition in the race to AGI
xott t1_j9i2zg3 wrote
It's the new Space Race
fumblesmcdrum t1_j9ib2lt wrote
latent space race
Hands0L0 t1_j9jqo4a wrote
Fuck dude, that's clever
phoenixmusicman t1_j9imtgj wrote
It's not a space race until governments start pouring massive amounts of their GDP into it.
Artanthos t1_j9l2n4a wrote
China is pouring money into AI research.
spreadlove5683 t1_j9i58z2 wrote
I sure don't. We need to get it right. Not barrel ahead in an arms race.
amplex1337 t1_j9ir5dt wrote
You know, the closer we get to AGI, the more that will happen. Every government will want to be the first in control of an ASI, which would basically make them the dominant superpower of the world. It will be as dystopian as it sounds.
[deleted] t1_j9j6qqt wrote
[deleted]
beachmike t1_j9k5205 wrote
There will be both good and bad that comes out as we get closer to AGI, and attain it, just like any other technological revolution. To paint it as either dystopian or utopian is naive.
Artanthos t1_j9l3i0n wrote
It depends. We cannot see the other side of a singularity.
We could have an alignment issue and end up as paper clips.
AI could solve everything from climate change to social inequality by reducing the human race to 50 million Stone Age hunter-gatherers.
Or, you could have the top 1% living in a utopia while everyone else is living in a dystopia.
Ziggy5010 t1_j9j1id1 wrote
Agreed
dangeratio t1_j9igzb4 wrote
Check out Amazon's multimodal chain-of-thought model: only 738 million parameters, and it scores better than ChatGPT on all question classes. See Table 4 on page 7 here: https://arxiv.org/pdf/2302.00923.pdf
Destiny_Knight t1_j9iupzk wrote
What the actual fuck is that paper? The thing performed better than a human at several different question classes.
At fucking less than one billion parameters. 100x less than GPT 3.5.
Edit: For clarity, I am impressed not angry lol.
IluvBsissa t1_j9j5t08 wrote
Are you angry or impressed ?
Destiny_Knight t1_j9j6iq0 wrote
impressed lol
IluvBsissa t1_j9j6v5v wrote
If these models are so smol and efficient, why are they not released? I just don't get it. I thought PaLM was kept private because it was too costly to run to be profitable...
kermunnist t1_j9kqsaw wrote
That's because the smaller models are less useful. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100x-smaller models were trained on the same data as GPT-3, they would perform far worse on these metrics (maybe not exactly 100x worse, because in this case the model was multimodal, which definitely gave it a performance advantage). The big reason this model performed so well is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. That means this model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.
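To make the "narrow fine-tuning" point concrete, here's a minimal sketch of specializing a small seq2seq model on benchmark-shaped questions. It's not the paper's actual training code; the model choice and the toy examples are just placeholders.

```python
# Minimal sketch: a small model fine-tuned only on one narrow question format
# spends all of its limited capacity on that format, which is how it can beat
# a far larger general-purpose model on that one benchmark.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")          # ~220M params
model = T5ForConditionalGeneration.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Hypothetical in-domain examples shaped like the benchmark questions.
train_pairs = [
    ("question: Which item is saltier? context: fries vs. unsalted crackers",
     "fries"),
    ("question: Which object conducts electricity? context: copper wire vs. rubber band",
     "copper wire"),
]

model.train()
for epoch in range(3):                      # tiny loop, just to show the shape
    for source, target in train_pairs:
        inputs = tokenizer(source, return_tensors="pt")
        labels = tokenizer(target, return_tensors="pt").input_ids
        loss = model(**inputs, labels=labels).loss   # seq2seq cross-entropy
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

Train only on that distribution and it gets very good at exactly that, and not much else.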
drekmonger t1_j9iios3 wrote
Heh. I tried their rationalization step with ChatGPT, just with prompting. For their question about the fries and crackers, it said the problem is flawed because there's such a thing as crackers with low or no salt. It also correctly inferred that fries are usually salted but don't have to be. (Of course, it didn't have the picture to go by, which was the point of the research.)
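For anyone curious, here's roughly the kind of two-stage "rationale first, answer second" prompting I mean, sketched against the plain completions API rather than the ChatGPT UI. The prompts are made up, and a text-only model obviously can't see the image the original question refers to.

```python
# Sketch of rationalize-then-answer prompting (assumed prompts, not the
# paper's code): stage 1 asks only for a rationale, stage 2 feeds that
# rationale back in and asks for the final answer.
import openai

openai.api_key = "sk-..."  # your API key

question = ("Which property do fries and crackers have in common? "
            "(A) salty (B) soft (C) sour")

# Stage 1: generate a rationale, no final answer yet.
rationale = openai.Completion.create(
    model="text-davinci-003",
    prompt=f"{question}\n\nLay out the relevant facts step by step, "
           "but do not state a final answer yet.",
    max_tokens=200,
)["choices"][0]["text"]

# Stage 2: answer, conditioned on the question plus the rationale.
answer = openai.Completion.create(
    model="text-davinci-003",
    prompt=f"{question}\n\nRationale: {rationale}\n\nFinal answer:",
    max_tokens=5,
)["choices"][0]["text"]

print(rationale.strip(), answer.strip(), sep="\n")
```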
Great paper though. Thanks for sharing.
challengethegods t1_j9i1lk3 wrote
>I'd be more impressed by a model smaller than GPT-3 that performed just as well.
From the article: "Aleph Alpha’s model is on par with OpenAI’s GPT-3 davinci model, despite having fewer parameters." So... you're saying you would be even more impressed if it used even fewer parameters? Anyway, I think anyone could guess that GPT-3 is poorly optimized, so it shouldn't be surprising that plenty of models have matched its performance on some benchmarks with fewer parameters.
ninadpathak t1_j9hxyog wrote
True, we've seen models a tad bigger than GPT-3 that are so bad even GPT-2 would blow them out of the water.
Think AI21's Jurassic Park or whatever they call their largest model. I hate how stupid it is.
musing2020 t1_j9it8e1 wrote
Achieving GPT 175B Level Accuracy with a 10x More Efficient Model
https://sambanova.ai/blog/achieving-gpt-175b-level-accuracy-with-a-10x-more-efficient-model/
Professional-Song216 t1_j9hwijh wrote
Great way to look at it; it's much more important to squeeze the maximum out of your system. Efficiency over excess.
burnt_umber_ciera t1_j9iqusp wrote
Are you aware of the "training material", "training time", or "training techniques" that were utilized?
Zer0D0wn83 t1_j9iwxqu wrote
I'm sure they've read those papers too, you know.
ironborn123 t1_j9j2512 wrote
All else being equal, number of model parameters does matter. Well funded startups can acquire the needed data, compute resources, and human talent to build the models. Just like how OpenAI beat Google at this game.
sonderlingg t1_j9itgq3 wrote
Artificial German Intelligence
H0sh1z0r4 t1_j9jjn56 wrote
never ask AGI what it was doing in 1939
InsideATurtlesMind t1_j9k57j3 wrote
Künstliche Allgemeine Intelligenz ("Artificial General Intelligence") 🇩🇪
myusernameblabla t1_j9l2v3q wrote
He Kai, sag mal was gescheites. ("Hey KAI, say something smart for once.")
Benderisgreat4 t1_j9ju7ap wrote
Surely it won't be funny...
SupportstheOP t1_j9ldmse wrote
Inb4 The Germans literally create Funny-Bot
ML4Bratwurst t1_j9hxzv8 wrote
It's not about the size ;)
Twinkies100 t1_j9ik1qr wrote
r/thatswhatshesaid
bluehands t1_j9inzl3 wrote
You obviously do not know my ex.
Ylsid t1_j9i9tr4 wrote
Great, and is it open source?
ddeeppiixx t1_j9j9d79 wrote
Of course not. Unless the research is done within a university context (or is publicly funded), you won't get the model open-sourced. Stable Diffusion is maybe the exception, and it seems to me like they regret releasing it and are now doing whatever they can to regain control.
needle1 t1_j9j9nyr wrote
Hm? Care to elaborate on what they’re doing to “regain control?”
ddeeppiixx t1_j9jav1p wrote
First they tried to take control of the SD subreddit (source). Apparently it was resolved on good terms.
Also, newer versions are much more controlled in terms of what you can generate: no more NSFW allowed, no more models based on "famous artists." There were also rumors about new license terms (not sure if that actually happened) that would essentially give them the legal power to force users to update to a newer version (as crazy as it sounds). There is a reason the community is still using the 1.5 version over 2.0.
Honestly, the way I see it, Stability AI are not doing this with bad intentions (at least I hope not); they are kind of forced to, since they are a legal entity and have to address all the threats of legislative action regarding explicit sexual content and living artists.
Ylsid t1_j9jamyy wrote
Unfortunate, but I figured. Something's up when the Russians are the only ones releasing LLMs
MysteryInc152 t1_j9lj5ef wrote
The GLM models are from China and open sourced.
Ylsid t1_j9mil4h wrote
I didn't know China was doing it too! I know Russia recently open sourced one. If their tactic is to undermine western power by open sourcing their next big products, they can keep doing it
WeedWacker25 t1_j9jgt2d wrote
I had to do some research on this. I found that the transformer architectures are mostly open source, but the trained models (the weights) are not.
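Concretely, the split looks something like this. GPT-2 is used below only because its weights happen to be public; for GPT-3 or Luminous, the second call simply isn't available.

```python
# Open architecture vs. released weights: anyone can build the network from
# the open-source code, but it's useless without a trained checkpoint.
from transformers import GPT2Config, GPT2LMHeadModel

# Architecture only: randomly initialized, no one has paid for the training.
untrained = GPT2LMHeadModel(GPT2Config())

# Architecture plus published weights. This download is exactly the part that
# stays behind an API for the big proprietary models.
pretrained = GPT2LMHeadModel.from_pretrained("gpt2")
```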
Ylsid t1_j9jgvso wrote
Boo
Kafke t1_j9is0o6 wrote
Paywalled, though, and likely just as censored. It's also currently not available. So... who cares?
Thorusss t1_j9nrh2c wrote
Geopolitics cares
Private_Island_Saver t1_j9j73ot wrote
Like what would happen if like 4% of global GDP was put into this?
IluvBsissa t1_j9j5rld wrote
Germany saving Europe again ! No wait..
Ortus14 t1_j9lhcci wrote
Singularity is approaching fast.
People might not realize that a sufficiently advanced LLM can simulate AI researchers and programmers. For example: "simulate a thousand of the top AI researchers discussing and then programming an AGI."
Thorusss t1_j9nrlcs wrote
any sufficiently advanced LLM is indistinguishable from trueAGI™?
VeganPizzaPie t1_j9hwb1w wrote
Gesundheit
[deleted] t1_j9htnkv wrote
[removed]
No_Ninja3309_NoNoYes t1_j9jtiy0 wrote
Static parameters are meaningless. Human brains are not static until after death. Besides, modeling reality requires more than a bit of algebra.
[deleted] t1_j9iin71 wrote
[deleted]
Villad_rock t1_j9k4oz6 wrote
I'm from Germany and I know Germany is incompetent at anything related to IT; it's all about the old economy. Don't get your hopes up.
Honest_Science t1_j9kx3c4 wrote
That is really a nice one
Honest_Science t1_j9kx92l wrote
I tried to use their system on their playground. It takes a lot of prompting to get anything sensible out of it.
Thorusss t1_j9nro02 wrote
The German CovidApp was surprisingly solid
datsmamail12 t1_j9klf3w wrote
I'm going to ask a genuine question, because no one has ever given me a clear answer: when will these language models actually start being useful?
Thorusss t1_j9nrssl wrote
They are useful now.
Two friends of mine use ChatGPT for work.
Gold-and-Glory t1_j9hyg48 wrote
And no bias?
Liktwo t1_j9io4z9 wrote
NEIN!
Thorusss t1_j9nrqp1 wrote
Oh, there are many biases.
These neural networks mostly consist of weights and biases.
Gold-and-Glory t1_j9o32jz wrote
Not this bias, the other kind, the one Reddit agrees with and downvotes you for mentioning, like a religious dogma.
Akimbo333 t1_j9hutcu wrote
Interesting
Dankbubbles123 t1_j9hb2j2 wrote
Eh, doesn't GPT-4 have like 1.4 trillion parameters? That would dwarf this by almost 5 times.
Edit: turns out, I was wrong! :D
Buck-Nasty t1_j9hb9az wrote
GPT-4's parameter count isn't known yet.
Dankbubbles123 t1_j9hbbp1 wrote
Ah okay, nvm then. Sorry
Buck-Nasty t1_j9hch2c wrote
The context window is apparently massive though, more than 10 times the size of GPT-3's. It could potentially write whole novels at that scale.
https://mobile.twitter.com/transitive_bs/status/1628118163874516992?s=46&t=Biiqy66Cy9oPH8c1BL6_JQ
hydraofwar t1_j9hgim5 wrote
A credible researcher commented that ChatGPT can write code, and that GPT-4 could write entire programs.
GPT-5entient t1_j9hk7td wrote
32k tokens would mean approximately 150 kB of text. That is a decent-sized code base! Also, with this much context memory the known context-saving tricks would work much better, so this could theoretically be used to create code bases of virtually unlimited size.
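Rough back-of-the-envelope for that 150 kB figure; the characters-per-token ratio is an assumption and varies a lot with the content being tokenized.

```python
# 32k tokens -> raw text size, assuming ~4.5 characters per BPE token.
tokens = 32_768
chars_per_token = 4.5
print(tokens * chars_per_token / 1024)  # ~144 KiB of plain text
```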
This amazes me and (being a software dev) also scares me...
But, as they say, what a time to be alive!
GPT-5entient t1_j9hji5i wrote
Wow, yeah, this looks amazing. My biggest issue with GPT-3 is the relatively small context window. This will open so many new possibilities.
Practical-Mix-4332 t1_j9hg2cr wrote
Is anything about GPT-4 actually known? It seems like just a bunch of rumors, and not even a release date.
Midnight-Movie t1_j9hv0t7 wrote
>Is anything about gpt4 known? It seems like just a bunch of rumors and not even a release date
I work with someone who has Beta access to GPT-4. He won't tell me much other than it's mind-blowing & that software development will never be the same. He confirms the rumors that it indeed can write an entire piece of software.
farcetragedy t1_j9hzfq1 wrote
That’s exciting. Would be amazing if the next one didn’t just make shit up when it doesn’t know the answer
Practical-Mix-4332 t1_j9hxkf3 wrote
Oh great, another rumor.
Midnight-Movie t1_j9hy4uz wrote
Well... You asked if anything was known. I gave you info from a coworker with beta access. My apologies if my info didn't come with a bouquet of roses and a handwritten card.
Practical-Mix-4332 t1_j9i0ctk wrote
I understand you're trying to help, but this being Reddit and all, there's no way we can trust what you are saying or take it officially as something "known". No offense though.
MysteryInc152 t1_j9j9dvt wrote
32k context window it seems.
https://mobile.twitter.com/transitive_bs/status/1628118163874516992?s=20
GPT-5entient t1_j9hkupz wrote
There was that very popular but completely unfounded rumor about 100T param count. It was debunked by Sam Altman himself.
If you think about it for just one second, you'd realize that a 100T parameter model would need at least 200 TB of VRAM, or 2,560 Nvidia A100s...
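Quick sanity check on those numbers, counting fp16 weights only; optimizer state, activations, and the KV cache would multiply the requirement several times over.

```python
# 100T parameters at 2 bytes each (fp16), weights only.
params = 100e12
bytes_total = params * 2
print(bytes_total / 1e12)   # 200.0 -> 200 TB of VRAM just to hold the model
print(bytes_total / 80e9)   # 2500.0 -> ~2,500 80 GB A100s, same ballpark as 2,560
```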
BlueMoon_Josh t1_j9hs5v1 wrote
It was 5 Morbillion parameters
bass6c t1_j9hfy9w wrote
ChatGPT is based on GPT-3, a 175-billion-parameter model.
drekmonger t1_j9hvs1w wrote
Number of parameters is not the whole story. Quality of training material, training time, and training techniques matter as much or more.
The larger models require more resources for inference, as well. I'd be more impressed by a model smaller than GPT-3 that performed just as well.