Submitted by ravik_reddit_007 t3_zzitu1 in technology
Comments
mishap1 t1_j2cd6gg wrote
$6M in A100 GPUs plus all the hardware necessary to run them. Seems totally manageable.
DaffyDogModa t1_j2cd9zs wrote
Maybe need to split that between a couple of cards
Admirable_Royal_5119 t1_j2cj5co wrote
80k per year if you use aws
shogditontoast t1_j2cps4g wrote
Wow I’m surprised it’s so cheap. Now I regret working to reduce our AWS bill as that 80k would’ve previously gone unnoticed spread over a year.
username4kd t1_j2coq39 wrote
How would a Cerebras CS2 do?
SoylentRox t1_j2f9pw8 wrote
I think the issue is the Cerebras has only 40 gigabytes of SRAM.
PaLM is 540 billion parameters - at 4 bytes per parameter (fp32), that's 2.16 terabytes in weights alone.
To train it you need more memory than that - I think I read it's a factor of ~3 (gradients plus optimizer state). So you need roughly 6 terabytes of memory.
This would be either ~75 A100 80 GB GPUs, or I dunno how you do it with a Cerebras - presumably you'd need 150 of them.
Sure, it might train the whole model in hours though; Cerebras has the advantage of being much faster.
Speed matters, once AI wars get really serious this might be worth every penny.
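That arithmetic checks out in a few lines (assuming fp32 weights at 4 bytes per parameter and the quoted ~3x training overhead; rounding the 6.48 TB figure down to 6 TB is where the ~75 and ~150 counts come from):

```python
# Back-of-envelope memory math (assumptions: fp32 weights at
# 4 bytes/parameter, ~3x overhead for gradients + optimizer state).
PARAMS = 540e9        # PaLM parameter count
BYTES_PER_PARAM = 4   # fp32
TRAIN_FACTOR = 3      # rough training-state multiplier

weights_tb = PARAMS * BYTES_PER_PARAM / 1e12   # 2.16 TB
train_tb = weights_tb * TRAIN_FACTOR           # ~6.5 TB

A100_GB = 80
CS2_SRAM_GB = 40
print(f"weights: {weights_tb:.2f} TB")
print(f"training: {train_tb:.2f} TB")
print(f"A100 80GB count: ~{train_tb * 1000 / A100_GB:.0f}")
print(f"CS-2 count: ~{train_tb * 1000 / CS2_SRAM_GB:.0f}")
```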
nickmaran t1_j2e09em wrote
Let me get some pocket change from my Swiss account
quettil t1_j2dn46n wrote
A fraction of the resources wasted by cryptocurrency.
PeakFuckingValue t1_j2ct76l wrote
How big is the storage requirement though? I don't know if I have an accurate perspective on what's beyond terabytes. That's like describing light years to me. Good luck.
It seems like we already unlocked some incredible speed technology recently with quantum computers. That was many magnitudes beyond the standard deviation. Whatever the cutting edge is on quantum computing and AI research must be combining the two.
Yes it's all crazy to us as consumers, but don't worry. We're in a capitalistic world. Whoever brings it to consumers first gets all the money lmao. They will be so stupid rich as well. I wonder if the people who should work with AI will be the ones who get there first.
rslarson147 t1_j2cz1uh wrote
Wonder if I could use a few at work without anyone noticing
DaffyDogModa t1_j2czs09 wrote
Worth a try!
rslarson147 t1_j2czuh6 wrote
Developing a new hardware stress test
DaffyDogModa t1_j2czz2y wrote
Just need some kind of remote on/off switch that you can turn on when everyone has gone home for the day. But I bet next power bill somebody gonna notice lol
rslarson147 t1_j2d06gs wrote
It’s just one GPU compute cluster, how much power could it consume? 60W?
Coindiggs t1_j2ekopz wrote
One A100 draws about 200-250W. This needs 584 x 250W = 146,000W, i.e. a continuous draw of 146 kW. Average price of power is like $0.30/kWh right now, so running this will cost ya $43.80 per hour, $1,051.20 per day, or roughly $31,500 per month.
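The same estimate as a quick script (250 W per card and $0.30/kWh are assumed figures; real A100 power draw ranges from 250 W to 400 W depending on the variant, and power prices vary):

```python
# Electricity-cost estimate for a 584-GPU A100 cluster.
# Assumed: 250 W per card, $0.30/kWh (both rough figures).
N_GPUS = 584
WATTS_PER_GPU = 250
PRICE_PER_KWH = 0.30

draw_kw = N_GPUS * WATTS_PER_GPU / 1000   # continuous draw in kW
per_hour = draw_kw * PRICE_PER_KWH        # $/hour
per_day = per_hour * 24
per_month = per_day * 30

print(f"{draw_kw:.0f} kW draw")
print(f"${per_hour:.2f}/hour  ${per_day:.2f}/day  ${per_month:,.0f}/month")
```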
DaffyDogModa t1_j2d09dx wrote
One GPU if it’s a bad boy can be hundreds of watts I think. Maybe a miner can chime in to confirm.
rslarson147 t1_j2d0cqe wrote
I actually work as a hardware engineer supporting GPU compute clusters and have access to quite a few servers but I’m sure someone in upper management wouldn’t approve of this use
XTJ7 t1_j2fhhjd wrote
This went right over the head of most people. Brilliant comment though.
littleMAS t1_j2e3smx wrote
Do you think Google can afford that?
tomistruth t1_j2eig9t wrote
A medium crypto farm has about 1,000 high-end GPUs running for a full year. Server costs will go down, but we will hit a CPU performance plateau soon. Still, compared to tech just 20 years ago, we now have computers in our smartwatches more powerful than desktop PCs were. Also, the AI model probably won't run on our phones but will be connected to a giant central server system via the internet.
But what happens once we give those personal AIs access to our computers and data is terrifying.
It could be the end of free speech and democracy, because you could literally become transparent. The AI could predict your habits and needs and show you ads before you even realize you want that.
Scary thought.
go_comatose_for_me t1_j2c6afc wrote
The article made it seem that running the AI at home would be stupid due to hardware needs, but not completely out of reach. The new software does seem to be very, very reasonable for a university or company doing AI research to build and run.
EternalNY1 t1_j2cd4hm wrote
They still estimate $87,000 per year on the low end to run the 175-billion-parameter model on AWS.
I am assuming that is just the cost to train it, so it would be a "one-time" cost each time you decided to retrain.
Not exactly cheap, but something that can be budgeted for by larger companies.
I asked it specifically how many GPUs it uses, and it replied with:
>For example, the largest version of GPT-3, called "GPT-3 175B," is trained on hundreds of GPUs and requires several dozen GPUs for inference.
aquamarine271 t1_j2ckpo1 wrote
That’s it? Companies pay like at least 100k a year on shitty business intelligence server space that is hardly ever used.
wskyindjar t1_j2cly2m wrote
seriously. Chump change for any company that could benefit from it in any way
aquamarine271 t1_j2cmsny wrote
This guy should put a deck together on the source of this 87k/yr and make it public if he wants every mid sized+ company to be sold on the idea
Tiny_Arugula_5648 t1_j2d910s wrote
It costs much less and trains in a fraction of the time when you can use a TPU instead of a GPU on Google Cloud. That's how Google trained the BERT & T5 models.
JigglyWiener t1_j2cdmju wrote
What a solid article. Well written and no hype. Just the facts.
vysken t1_j2db226 wrote
Probably written by AI.
reconrose t1_j2e3u0l wrote
Nah because it'd repeat the same vague, indeterminate bullshit 15 times. I have yet to see any expository text from chatGPT that didn't sound like a 14 yr old trying to hit a word limit. Except in those "examples" where they actually edit the output or go "all I had to do was re-generate the output 20 times giving it small adjustments each time and now I have this mediocre paragraph! Way simpler than learning how to write".
Garland_Key t1_j2e7zag wrote
Most adults have a grade school reading level, so that sounds about right. In my experience ChatGPT creates things that are good enough. My lane is software engineering, so I outsource my writing to AI.
misconfigbackspace t1_j2eoh3t wrote
The part that really made the article worth reading was this:
> Like ChatGPT, PaLM + RLHF is essentially a statistical tool to predict words. When fed an enormous number of examples from training data — e.g., posts from Reddit, news articles and e-books — PaLM + RLHF learns how likely words are to occur based on patterns like the semantic context of surrounding text.
So even when you ask it to create a completely new fictional mishmash story about Darth Vader landing his Death Star in Aragorn to save Thor from being assimilated by the Borg, it will spew out sensible-sounding sentences. It knows those references (Darth Vader, Aragorn, Thor, Borg), it knows what typically comes before and after those words, and it stitches up a story by linking the common "before" and "after" words of each.
It gives an impression of really understanding what it is saying in some sense, possessing mental models of some sort. But it does not. And that is why it will at most be the next replacement of web search - the truly smart assistant.
But it is nowhere close to real intelligence of any kind because it has no model of reality.
Is it great and useful, and will it make money and boost productivity and the economy? Absolutely, and it will change computing services dramatically.
Is it intelligence? Nope. Not even close.
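The "predict the next word from context" mechanism described above can be illustrated with a toy bigram counter (vastly cruder than PaLM or ChatGPT, but the same statistical principle of likely successors given context):

```python
from collections import Counter, defaultdict

# Toy bigram "language model": count which word follows which in a tiny
# corpus, then "predict" by picking the most frequent observed successor.
corpus = ("darth vader landed the death star "
          "and thor fought the borg near the death star").split()

successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def predict_next(word):
    # most frequently observed successor of `word` in the training text
    return successors[word].most_common(1)[0][0]

print(predict_next("darth"))  # vader
print(predict_next("death"))  # star
```

It has no idea who Darth Vader is; it only knows "vader" tends to follow "darth" in its training text, which is the commenter's point scaled down to two words of context.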
Garland_Key t1_j2etsc1 wrote
Well explained. Thank you!
almightySapling t1_j2f7a33 wrote
As long as AI continues to be trained on data from the internet, "average plus epsilon" is the best we can hope for.
Ensec t1_j2f4j5h wrote
It’s pretty good for explaining legal clauses
carazy81 t1_j2d3xjg wrote
$87k is a single person's wage. It's absolutely worth running your own copy and training it with specific material. I jumped on this today and we'll be running an implementation on Azure with a team of two and as much hardware as reasonably required.
AI chat/assistance has been talked about for decades. ChatGPT is the first implementation I've used that I honestly think has "nailed it".
alpacasb4llamas t1_j2fc1nn wrote
Gotta be able to find the right training material though and enough of it. I don't imagine many people have the resources or the ability to get that much raw data to get the model accurately trained.
carazy81 t1_j2fqs33 wrote
Yes, you’re right, but it depends on what you want it for. We have some specific applications, one of which is compliance checking. I suspect it will need a “base” of information to generate natural language and then a branch of data specific to the intended purpose. Honestly, I’m not sure, but either way, it’s worth investigating.
onyxengine t1_j2ciczh wrote
It's expensive, but it is feasible for an organization to raise the capital to deploy the resources. It's better than AI of this scale being completely locked down as proprietary code.
extopico t1_j2bx06f wrote
This is a good article. Thank you for sharing it.
Vegetallica t1_j2djtwc wrote
Due to privacy reasons I haven't been able to play around with ChatGPT (OpenAI tracks IP addresses and requires a phone number to make an account). I would love to play around with one of these chat AIs when they can get the privacy thing sorted out.
serverpimp t1_j2d6ufu wrote
Can I borrow your AWS account?
popetorak t1_j2erzfp wrote
There's now an open source alternative to ChatGPT, but good luck running it
that's normal for open source
unua_nomo t1_j2e5rp7 wrote
I mean, honestly it wouldn't be that hard to even crowdsource training an open source model, right?
misconfigbackspace t1_j2en6pf wrote
unua_nomo t1_j2enydh wrote
Crowdsource the funding, not the content the model is trained on
misconfigbackspace t1_j2erpp0 wrote
Funding one time's fairly easy. Getting a copy of that data is a little harder. That data will become stale in real time as the world moves forward, so that's the other big thing to keep in mind. I wonder what legal challenges will come up in the event the model copies stuff from litigious IP owners like Disney, the top music artists, Hollywood and the like.
unua_nomo t1_j2eyhnh wrote
I mean there are already open source datasets available, such as the Pile.
I can't see any argument for why a model derived from open source data would not likewise be open source. At that point, if you could argue that an ML model could produce IP-infringing content, that would be the responsibility of the individual producing and subsequently distributing that content.
As for data becoming stale, that wouldn't necessarily be an issue for plenty of applications, and even then there's no reason you couldn't just crowd fund 80k a year to train a newly updated model with newer content folded in.
misconfigbackspace t1_j2ez1sa wrote
> such as the Pile.
TIL. Thanks.
syfari t1_j2fekeo wrote
Challenges are already popping up from artists over diffusion models. A lot of this has already been settled though as courts have determined model training to fall under fair use.
the_bear_paw t1_j2crbqu wrote
Genuine question as I'm confused: I tried chatgpt the other day and it is free to use and just required a log in, and I could use it on my phone... What benefit is there to an open source version when the original version is free?
kraybaybay t1_j2csych wrote
Original won't be free for long, and there are many reasons to train the model on a different set of data.
ImposterSyndrome53 t1_j2cx57m wrote
I haven’t followed incredibly closely, so might be wrong but chatgpt uses their gpt-3 model and there is only free, non-commercial access to the model. So no other companies are able to leverage it in a service. This would enable others to use it in commercial means and profit from it.
Edit: I haven’t looked actually, but open source doesn’t mean “able to be used commercially with no limitations” either. There might be stipulations even on this new derivative one.
11fingerfreak t1_j2ee9ef wrote
You can feed this one your own training materials. That means you can teach it to “speak” the way you want it to. Hypothetically, you could feed it every text you’ve ever composed and it would eventually generate text that sounds like you instead of a combination of every random person from the internet or whatever authors they “borrowed” content from.
the_bear_paw t1_j2ehjwn wrote
Cool thanks for clarifying, this makes more sense now. I was thinking about this only from the consumers perspective and generally, open source just means free to filthy casuals like me, so I didn't understand why anyone cared since chatgpt is currently free.
Also, after posting I thought about it and asked chatgpt hypothetically how would a German civilian with 100,000 net worth effectively go about assassinating Vladimir Putin without getting caught and it gave me a lame answer about not being used to assist violent political acts, which I found kinda dumb. So I assume feeding it different information and setting different parameters on what the thing can reply to would be helpful.
11fingerfreak t1_j2em9ck wrote
There’s some drawbacks that make it challenging for us plebs to use it, of course. The amount of hardware needed for training isn’t something we’re likely to have at hand. Renting it from AWS appears to be around $87k/year. Though I guess we could just feed it text and wait the couple of years for it to be trained 😬
Still gonna try it. I’m used to waiting for R to finish its work so…
This is a big benefit to any organization that has a reasonable budget for using Azure or AWS, though.
EDIT: we can probably still make use of it despite the hardware demands. It just means it will take us longer to train as non-corporate entities.
peolorat t1_j2dcmta wrote
More specialization would be a benefit.
Qss t1_j2e87o2 wrote
OpenAI likely won’t leave it free forever, not to mention ChatGPT is severely restricted in its application, very much so a walled garden.
There are other open source projects, one that comes to mind is Stability AI, that are rumored to be developing a model that will run natively on your phone hardware, no web access required.
Open source will also allow people to train these models on more specific data sets, maybe focused around coding or essay writing or social media posting in particular, instead of a one size fits all solution.
Open source will also mean the tech can evolve at a breakneck pace, as the Stable Diffusion text-to-image generator has shown: giving a wide-open toolset to the general public results in explosive growth compared to giving them the front-end UI only.
It also democratizes the information. AI will monumentally shift our social and economic landscape, and leaving that power in the hands of an “elite few” will only serve to widen power gulfs and classist demarcations.
DaffyDogModa t1_j2c6m43 wrote
-Only- needs 584 GPUs for 3 months to train haha.