Dendriform1491 t1_j5ywgiz wrote on January 26, 2023 at 3:08 PM

Also, Google doesn't use GPUs, they designed their own cards which they call TPUs.

TPUs are ASICs designed specifically for machine learning, they don't have any graphics related components, they are cheaper to make, use less energy and can make as many as they want.

cdsmith t1_j5z0rrm wrote on January 26, 2023 at 3:37 PM

You don't have to be Google to use special-purpose hardware for machine learning, either. I work for a company (Groq) that makes a machine learning acceleration chip available to anyone. Groq has competitors, like SambaNova and Cerebras, with different architectures.

Taenk t1_j60gdbl wrote on January 26, 2023 at 8:59 PM

Do these also increase inference speed? How much work is it to switch from CUDA based software to one of these?

cdsmith t1_j60q0bs wrote on January 26, 2023 at 9:59 PM

I can only answer about Groq. I'm not trying to sell you Groq hardware, honestly... I just honestly don't know the answers for other accelerator chips.

Groq very likely increases inference speed and power efficiency over GPUs; that's actually its main purpose. How much depends on the model, though. I'm not in marketing so I probably don't have the best resources here, but there are some general performance numbers (unfortunately no comparisons) in this article, and this one talks about a very specific case where a Groq chip gets you a 1000x inference performance advantage over the A100.

To run a model on a Groq chip, you would typically start before CUDA enters the picture at all, and convert from PyTorch, Tensorflow, or a model in several other common formats into a Groq program using https://github.com/groq/groqflow. If you have custom-written CUDA code, then it's likely you've got some programming work ahead of you to run on something besides a GPU.

lucidrage t1_j61so7l wrote on January 27, 2023 at 2:33 AM

>convert from PyTorch, Tensorflow, or a model in several other common formats into a Groq program

Are there any effort spend in adding a plugin for a high level framework like keras to automatically use groq?

cdsmith t1_j62a3yv wrote on January 27, 2023 at 4:55 AM

I'm not aware of any effort to build it into Keras, but Keras models are one of the things you can easily convert to Groq chips using groqflow.

gradientpenalty t1_j61tko2 wrote on January 27, 2023 at 2:40 AM

Okay, so where can I buy it as a small startup for under 10k without signing any NDA for using your proprietary compiler. As far as I can see, we are all still stuck with Nvidia after 10B of funding for all these "AI" hardware startup.

cdsmith t1_j626c0c wrote on January 27, 2023 at 4:21 AM

I honestly don't know the price or terms of use, for this or any other company. I'm not in sales or marketing at all. I said you don't need to be Google; obviously you have to have some amount of money, whether you're buying a GPU or some other piece of hardware.

CKtalon t1_j5y87e5 wrote on January 26, 2023 at 11:44 AM

People often quote Chinchilla about performance, claiming that there's still a lot of performance to be unlocked when we do not know how GPT 3.5 was trained. GPT 3.5 could very well be Chinchilla-optimal, even though the 1st version of davinci was not Chinchilla-optimal. We know that OpenAI has retrained GPT 3 due to the increased context length going from 2048 to 4096 to the apparent 8000ish tokens for ChatGPT.

manubfr t1_j5y8mo0 wrote on January 26, 2023 at 11:48 AM

You're right, it could be that 3.5 is already using that approach. I guess the emergent cognition tests haven't yet been published for GPT-3.5 (or have they?) so it's hard for us to measure performance as individuals. I guess someone could test text-davinci-003 on a bunch of cognitive tasks on the PlayGround but I'm far too lazy to do that :)

CKtalon t1_j5y9deu wrote on January 26, 2023 at 11:57 AM

There's also the rumor mill that Whisper was used to gather a bigger text corpus from videos to train GPT 4.

FallUpJV t1_j5ya6t5 wrote on January 26, 2023 at 12:06 PM

This is something that I often read, that other LLMs are undertrained, but how come the OpenAI one is the only one not to be ? Datasets ? Computing power ?

MysteryInc152 t1_j60vz8p wrote on January 26, 2023 at 10:38 PM

OpenAI's models are still undertrained as well.

Few questions about scalability of chatGPT [D]

manubfr t1_j5y6wko wrote on January 26, 2023 at 11:28 AM