timshel42 t1_j8yx6wb wrote

Couldn't people who are into the open-source thing still host it on their own powerful servers and allow others to use it?

15

IonizingKoala t1_j8z0znz wrote

Of course "regular" people will be able to use it, the same way regular people get access to state of the art quantum computers and supercomputers.

What TunaFish is saying is that it's unlikely everyone will be able to run it in their own home. LLM engineers concur; Moore's law isn't quite there anymore.

If you mean server time, that's obviously possible (I can run loads of GPT-3 right now for $5). But that's not exactly running it at home, if you know what I mean.

7

Soft-Goose-8793 t1_j90cxmk wrote

Could an LLM be run the way torrents, Bitcoin, or Tor are? We could have LLM miners or something.

A small company could rent server time in some country with lax laws to run an unlobotomised version of an LLM, and people could subscribe to that service instead of dealing with Microsoft or OpenAI.

4

IonizingKoala t1_j91lzfv wrote

The thing is that in LLM training, memory and IO bandwidth are the big bottlenecks. If every GPU has to communicate via the internet, and wait for the previous person to be done first (because pipeline model parallelism is still sequential, despite the name), it's gonna finish in like 100 years. Another slowdown is breaking up each layer into pieces that individual GPUs can handle. Currently they're being spread out to 2000-3000 huge GPUs and there's already significant latency. What happens if there are 20,000 small-sized GPUs? Each layer is gonna be spread out so thin the latency is gonna be enormous. The final nail in the coffin is that neural network architecture changes a lot, and each time the hardware has to be reconfigured too.
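To put rough shape on the "sequential pipeline over the internet" point, here is a back-of-envelope sketch. Every number in it (stage count, per-stage compute time, activation size, link speeds, latencies) is a made-up assumption for illustration, not a measurement, and the model is deliberately naive: one batch walks through the stages in order, the backward pass is assumed to cost the same as the forward pass, and there is no micro-batch overlap, which real pipeline schedules use to hide some of this cost.

```python
def pipeline_step_seconds(num_stages, compute_s_per_stage,
                          hop_latency_s, activation_bytes, link_bytes_per_s):
    """Naive time for one forward + backward pass through a sequential pipeline.

    Each stage computes, then ships its activations to the next stage.
    Assumes backward costs the same as forward and nothing overlaps.
    """
    transfer_s = hop_latency_s + activation_bytes / link_bytes_per_s
    hops = num_stages - 1  # one transfer between each pair of neighbours
    one_way_s = num_stages * compute_s_per_stage + hops * transfer_s
    return 2 * one_way_s


# Hypothetical setup: 64 stages, 10 ms of compute each, 50 MB of activations per hop.
STAGES = 64
COMPUTE_S = 0.010
ACTIVATION_BYTES = 50e6

# Assumed datacenter interconnect: ~100 GB/s, ~5 microseconds latency.
datacenter_s = pipeline_step_seconds(STAGES, COMPUTE_S, 5e-6, ACTIVATION_BYTES, 100e9)

# Assumed home connection: ~10 MB/s upload, ~50 ms latency.
internet_s = pipeline_step_seconds(STAGES, COMPUTE_S, 0.050, ACTIVATION_BYTES, 10e6)

print(f"datacenter: {datacenter_s:.2f} s per step")
print(f"internet:   {internet_s:.2f} s per step")
print(f"slowdown:   {internet_s / datacenter_s:.0f}x")
```

Under these assumptions the compute time is identical in both cases; the whole gap comes from the per-hop transfer term, which is the bottleneck being described.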

Crypto mining didn't have these problems because 1. bandwidth was important, but not the big bottleneck, 2. "layers" could fit on single GPUs, and if they couldn't (on a 1050 Ti for example), it was very slow, and 3. the architecture didn't really change, you just did the same thing over and over.

Cerebras is trying to make a huge chip that disaggregates memory from compute, and also bundles compute into a single chip, saving energy and time. The cost for the CS-2 system is around $3-10 million for the hardware alone. It's pretty easy for a medium-sized startup to offer some custom LLM. I mean there's already dozens, if not hundreds of startups starting to do that right now. It's expensive. All complex computing is expensive, we can't really get around that, we can only slowly make improvements.

4

Deadboy00 t1_j91v5cp wrote

⭐️ Refreshing to see someone who knows their shit on this sub. Where do you see this tech going for general use cases? Everything I read tells me it just isn’t ready. What is MS’s endgame for implementing all this?

2

IonizingKoala t1_j927ast wrote

Classical computing / engineering advances are good at repetitive actions. A human can never put in a screw 10,000 times with 0.01mm precision or calculate 5000 graphs by hand without quitting. But it's bad at actions that require flexibility and adaptation, like what chefs, dry cleaners, or software engineers do.

LLMs and AI attempt to bridge that gap by allowing computers to be flexible and adapt. The issue is that we don't know how much they're actually capable of adapting, and how fast. We know humans have a limit; nobody in the world fluently speaks & reads & writes in more than 10 languages (probably not even >5). Do computers have a limit? How expensive is that limit? Because materials, manufacturing, and energy are finite resources.

What do you define as general use cases? Receptionist calls? (already done, one actually fooled me into thinking it was a human) Making a cup of coffee?

Anything repetitive will be automated, if it's economical to do so. You probably still make tea by hand, because it's a waste of money to buy a $100 tea maker (and they probably don't even exist because of how easy it is to make tea). But you probably have a blender, because it's a huge waste of time and energy to chop stuff yourself.

I think humans (on this subreddit especially) tend to underestimate how much finances & logistics play into tech. We've had flying cars since the 90s, yet they'll never "transform transportation" like sci-fi said, because it's dumb to have a car-plane hybrid.

We might get an impressive AGI in the next few years, but it might be so expensive that it's just used the same way we use robots: you get the cutting-edge stuff you'll never see cause it's in some factory, the entertaining stuff like the cruise ship robo-bartenders, and the consumer-grade crap like Roombas. AGI might also kill millions of humans but I know nothing about that side of AI so I won't comment.

Btw, I'm not an expert, I'm just a software engineer that likes talking to AI engineers.

2

Deadboy00 t1_j929dnb wrote

Dig it. I have a similar background and have had conversations with interns at AI firms like Palantir that have been doing the shit you described for years. I agree. It’s too expensive to train AIs for every specific use case. That’s what I meant by “general”.

I think the most fascinating part of this current trend is seeing the general population’s reaction to these tools being publicly released. And that’s what’s at the heart of my question… if the tech is unreliable, expensive, and generally not scalable… why is MS doing this?

I mean obviously they are generating data on user interactions to retrain the model but I can’t imagine that being the silver bullet.

Google implemented plenty of AI tech in their search engine and nobody raised an eyebrow, but now all this? I’m rambling at this point but it’s just not adding up in my brain ¯_(ツ)_/¯

2

IonizingKoala t1_j92caso wrote

Microsoft is similar to Google; both like to experiment and make cool stuff, but Microsoft doesn't cut the fat and likes to put out products which are effectively trash under the guise of open beta. Heck, even their hardware is sometimes like that, while Google's products are typically solid, even if they have a short lifespan.

Going back to New Bing, it's genuinely innovative. It just sucks. That's not paradoxical, because a lot of new stuff does suck. We just rarely see it, because companies like Google are generally disciplined enough.

Most "deep" innovations are developed over decades. That development could be secretive (military tech), or open (SpaceX, Tesla), but it takes time nonetheless. Microsoft leans towards the latter, Google the former.

The latter is generally more efficient, if your audience is results-focused, not emotions-focused. AI is pretty emotionally charged, so maybe the former method is better.

2

Deadboy00 t1_j92j3s2 wrote

That’s a good take. I think Google’s discipline is rooted in its size and prominence. There’s too much to lose. MS on the other hand desperately wants to be the king of the hill again.

2

IonizingKoala t1_j92nqhq wrote

The funny thing is though, Microsoft has a market cap 58% larger than Alphabet, not just Google. We're left wondering why Microsoft continually takes these weird risks in the consumer space when they can just play it safe like most other big players. None of their (21st century) success has been due to quirky disruptions, it's usually been slow and steady progress (Surface, Office, Enterprise, Cloud, Consulting).

Yet with stuff like Edge, Windows 11, etc, it's been a mess. I'm not 12 anymore, I prefer stable products over the shiniest new thing, and Windows 11 has been a colossal disappointment.

1

duboispourlhiver t1_j90jyl8 wrote

True. Progress in AI is even more impressive than Moore's law was, so maybe it will run at home because of progress on LLMs and not progress on microelectronics.

1

IonizingKoala t1_j91jdx7 wrote

LLMs will not be getting smaller. Getting better ≠ getting smaller.

Now, will really small models be run on some RTX 6090 Ti in the future? Probably. Think GPT-2. But none of the actually useful models (X-Large, XXL, 10XL, etc) will be accessible at home.

1

duboispourlhiver t1_j91k8jk wrote

I disagree

1

IonizingKoala t1_j91m923 wrote

Which part? LLM-capable hardware getting really really cheap, or useful LLMs not growing hugely in parameter size?

1

duboispourlhiver t1_j91x4ao wrote

I meant that IMHO, GPT-3-level LLMs will have fewer parameters in the future.

2

IonizingKoala t1_j924sbn wrote

I see. Even at a 5x reduction in parameter size, that's still not enough to run on consumer hardware (we're talking 10B vs. 500M), but I recognize what you're trying to say.
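For a rough sense of scale, here is a minimal sketch of the weights-only memory arithmetic. It assumes GPT-3's published 175B parameter count, 2 bytes per parameter (16-bit weights), and a 24 GB consumer GPU as the comparison point; it ignores activations and everything else a running model needs, so the real requirement would be higher.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory needed just to hold the weights, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


GPT3_PARAMS = 175_000_000_000   # published GPT-3 size
REDUCED_PARAMS = GPT3_PARAMS // 5  # the hypothetical 5x reduction
CONSUMER_GPU_GB = 24            # assumed high-end consumer card

for label, params in [("GPT-3 (175B)", GPT3_PARAMS),
                      ("5x smaller (35B)", REDUCED_PARAMS)]:
    needed = weight_memory_gb(params)
    verdict = "fits" if needed <= CONSUMER_GPU_GB else "does not fit"
    print(f"{label}: {needed:.0f} GB of weights, {verdict} in {CONSUMER_GPU_GB} GB")
```

Changing `bytes_per_param` shows how much the answer depends on weight precision, which this sketch treats as a free parameter rather than a claim about any particular model.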

2

freeman_joe t1_j90pioh wrote

We already have access to quantum computers; we call them human brains. We can see that nature solved it, so it's only a matter of time until we do the same with tech and it becomes available for home use.

1