Submitted by mippie_moe t3_ym5b6h in MachineLearning

RTX 4090 vs RTX 3090 Deep Learning Benchmarks

Some RTX 4090 Highlights:

  • 24 GB memory, priced at $1599.
  • RTX 4090's Training throughput and Training throughput/$ are significantly higher than RTX 3090 across the deep learning models we tested, including use cases in vision, language, speech, and recommendation systems.
  • RTX 4090's Training throughput/Watt is close to RTX 3090, despite its high 450W power consumption.
  • Multi-GPU training scales decently in our 2x GPU tests.
79

Comments

Zer01123 t1_iv2lvf4 wrote

The training throughput/$ seems off, in my opinion, because it uses the official prices instead of the street prices:

  • the 3090 is around 1.1k € in Germany/Europe
  • the 4090 is around 2.3k € in Germany/Europe

With those numbers, I don't think the 4090 can beat the 3090 in price to performance.

The 4090 would need to have double the performance of the 3090 to make it worth it.

But it's interesting to see how the performance scales across multiple GPUs.

42

suflaj t1_iv2z6fz wrote

Just a small correction: according to Geizhals, the RTX 4090 is 2030€ and the 3090 is 1150€. So the 4090 would, at this point, need to be around 176% as powerful. But the prices of the 4090 will fall and those of the 3090 will rise, so comparing on MSRP makes sense, since it is more stable than street prices.
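A quick sketch of that arithmetic for anyone who wants to plug in their own numbers; the prices below are the Geizhals street prices from this comment:

```python
# Breakeven check: how much faster the 4090 must be to match the 3090
# on training throughput per euro at these street prices.
price_3090_eur = 1150  # Geizhals street price quoted above (approximate)
price_4090_eur = 2030  # Geizhals street price quoted above (approximate)

breakeven_speedup = price_4090_eur / price_3090_eur
print(f"4090 needs to be {breakeven_speedup:.2f}x faster to tie on throughput/EUR")
# -> about 1.77x, i.e. the ~176% figure above
```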

5

bellyflop111 t1_iv40ead wrote

Don't forget it consumes more power, so the performance increase needs to account for that.

4

suflaj t1_iv5iar6 wrote

That would be another metric, something like performance per watt per dollar, which is not included in the benchmark and is probably uninteresting to most people, since cards like the 3060 would then come out ahead.
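If anyone does want that metric, it's trivial to compute; every number below is an illustrative placeholder, not a benchmark result:

```python
# Hypothetical composite metric: throughput per watt per dollar.
# Every number below is an illustrative placeholder, not a measurement.
cards = {
    #            images/sec, watts, USD
    "RTX 3060": (100, 170, 329),
    "RTX 3090": (220, 350, 1499),
    "RTX 4090": (330, 450, 1599),
}

for name, (throughput, watts, price) in cards.items():
    score = throughput / (watts * price)
    print(f"{name}: {score:.2e} images/sec per (W * $)")
```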

−1

zaphdingbatman t1_iv5ok40 wrote

So long as MSRP continues to be more fantasy than reality, I am not interested in seeing it in comparisons. I will become interested only after it becomes reality again. It might be a while.

4

suflaj t1_iv5prfi wrote

That's on you tbh

I don't think it's very scientific to judge properties of a card based on the whim of an unregulated agent. That way we could conclude that ancient cards, which are basically worthless, are the best.

But other than that, I don't think there's a single person who would have recommended anything other than a 3090, even before these benchmarks. It's the same situation as the 1080 Ti vs the 2080 Ti: the 3090 is just too good of a card.

0

killver t1_iv2t4ll wrote

Yeah, not sure where they get this conclusion from.

3

nmkd t1_iv3miyj wrote

4090 starts at 1950€ here in Germany

3

chatterbox272 t1_iv3tw2v wrote

Street prices vary over time and location. For example, I have zero issues getting a 4090 at RRP where I am in Australia. Using RRP for comparisons makes the comparisons more universal and evergreen.

If you're considering a new GPU you'll need to know the state of your local market, so you can take the findings from Lambda, apply some knowledge about your local market (e.g. 4090s are actually 1.5x RRP right now where you are, or whatever), and then you can redraw your own conclusions. Alternatively, if they were using "street prices" I'd also have to know the state of their local market (wherever that happens to be) at the time of writing (whenever that happens to be), then work out the conversion from their market to mine.

2

chuanli11 t1_ivhlelj wrote

We used the recommended retail price from NVIDIA but OMG they are expensive on the street Lol

1

learn-deeply t1_iv3tuol wrote

Thanks for creating the benchmark!

FYI, these results aren't exactly accurate because CUDA 12, which supports the Hopper architecture, isn't out yet, so none of the fp8 cores are being used and it's not taking advantage of optimizations specific to Hopper. From the NVIDIA whitepaper:

> With the new FP8 format, the GeForce RTX 4090 delivers 1.3 PetaFLOPS of performance for AI inference workloads.

CUDA 12 will be released sometime in 2023, whenever they start delivering H100 GPUs, and it'll take some time for frameworks to add support.

Also, the multi-GPU test is lacking some details that would be really helpful to know: how many PCIe 4.0 lanes is each GPU using? Is the test doing model parallelism or data parallelism?

12

Flag_Red t1_iv4h3kz wrote

I'm super hyped for fp8 support in CUDA. Combined with some other techniques it could put LLM inference (GPT-175B, for example) in reach of consumer hardware.
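Some rough napkin math on why fp8 alone isn't enough and those other techniques still matter (weights only, ignoring activations and the KV cache):

```python
# Back-of-the-envelope weight memory for a 175B-parameter model,
# ignoring activations, KV cache, and framework overhead.
params = 175e9

for fmt, bytes_per_param in [("fp32", 4), ("fp16", 2), ("fp8", 1), ("int4", 0.5)]:
    gib = params * bytes_per_param / 2**30
    print(f"{fmt}: ~{gib:,.0f} GiB of weights")
# Even at fp8 that's ~163 GiB of weights alone, so a single 24 GB card
# still needs offloading, sharding, or more aggressive compression on top.
```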

7

whata_wonderful_day t1_iv57znb wrote

Performance will definitely get better as time goes on, but fp8 is going to be extra work to use, just like fp16.
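For context, this is roughly the opt-in plumbing fp16 mixed precision already needs in PyTorch today (the model, optimizer, and data below are placeholders); fp8 will presumably need something similar:

```python
import torch

# Placeholder model, optimizer, and data; the point is the autocast/GradScaler plumbing.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # rescales the loss to avoid fp16 gradient underflow
x = torch.randn(32, 1024, device="cuda")
target = torch.randn(32, 1024, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():           # run eligible ops in fp16, the rest in fp32
    loss = torch.nn.functional.mse_loss(model(x), target)
scaler.scale(loss).backward()             # backward pass on the scaled loss
scaler.step(optimizer)                    # unscales gradients, then steps the optimizer
scaler.update()                           # adjusts the scale factor for the next step
```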

5

chuanli11 t1_ivhkx9p wrote

Hey, thanks for the comment. We made sure each GPU uses x16 PCIe 4.0 lanes. It is data parallel (PyTorch DDP, specifically).

We look forward to the FP8/CUDA 12 update too.
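For anyone curious, a minimal sketch of that kind of single-node DDP setup (the model, data, and loop are placeholders, not the actual benchmark code):

```python
# Minimal single-node DDP sketch (placeholder model, data, and loop).
# Launch with: torchrun --nproc_per_node=2 ddp_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")            # one process per GPU
    rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(1024, 10).cuda(rank), device_ids=[rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(10):                        # placeholder training loop
        x = torch.randn(64, 1024, device=rank)
        y = torch.randint(0, 10, (64,), device=rank)
        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()                        # gradients are all-reduced across GPUs here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```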

2

onyx-zero-software t1_iv57n5r wrote

> In summary, the GeForce RTX 4090 is a great card for deep learning, particularly for budget-conscious creators, students, and researchers.

Lol what?

11

husmen93 t1_iva9gsq wrote

>students

So as a student in Finland, I can choose between paying 6 months of my rent or buying an RTX 4090 xD

5

wen_mars t1_iv4savp wrote

> The reference prices for RTX 3090 and RTX 4090 are $1400 and $1599, respectively.

Use realistic prices and the results look very different.

> Depending on the model, its TF32 training throughput is between 1.3x to 1.9x higher than RTX 3090.

> Similarly, RTX 4090's FP16 training throughput is between 1.3x to 1.8x higher than RTX 3090.

8

whata_wonderful_day t1_iv20f1u wrote

Awesome, I much appreciate the detailed benchmarks! The dual-GPU scaling in particular was of interest to me. I was wondering how the lack of NVLink would affect things.

BERT large benchmarks would also be great, if you could do them?

6

killver t1_iv2swqz wrote

Thanks for that - unfortunately it confirms that it is performing worse than many were hoping.

5

MrAcurite t1_iv3kj8y wrote

Between the less than advertised performance and the stories about literally melting the power cables, I think I'm gonna stick with my underclocked eBay 3090 for the time being.

2

MisterManuscript t1_iv460q1 wrote

That extra money spent on the RTX 4090 is gonna go to waste since you're not using the built-in optical-flow accelerators for training; they're designed and optimized specifically, and only, to compute optical-flow fields for games efficiently. Better off sticking to the RTX 3090.

3

wen_mars t1_iv4s1ic wrote

The optical flow processors are only a small part of the chip. It has way higher tensor compute available to AI. The real weakness is the limited memory bandwidth.
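Some rough spec-sheet math on that point (the TFLOPS and bandwidth figures below are approximate published numbers, treated here as assumptions):

```python
# Approximate spec-sheet figures: dense FP16 tensor TFLOPS and memory bandwidth in GB/s.
# Rough published numbers used as assumptions, not measurements.
specs = {
    "RTX 3090": (142, 936),
    "RTX 4090": (330, 1008),
}

for name, (tflops, gbps) in specs.items():
    # FLOPs available per byte of memory traffic: the higher this gets,
    # the more often bandwidth-bound layers leave the tensor cores idle.
    print(f"{name}: ~{tflops * 1e12 / (gbps * 1e9):.0f} FLOPs per byte")
# Tensor compute roughly doubled while bandwidth grew only ~8%,
# so memory-bound workloads see much smaller gains than compute-bound ones.
```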

5

Krokodeale t1_iv3cv96 wrote

I'm really curious where the 4080 16GB will sit, since it's supposed to be a couple hundred more than the 3090.

2

husmen93 t1_iv7v18k wrote

>I'm really curious where the 4080 16GB will sit, since it's supposed to be a couple hundred more than the 3090

And the effect of a shorter memory bus on the new models in general; after all, last-gen cards were mostly memory-bandwidth bottlenecked.

2

Krokodeale t1_iv7zv22 wrote

So the 3090 Ti would be a better choice than the 4080 anyway?

3

husmen93 t1_iva8w7u wrote

I think that's very likely to be the case, but real-world results remain to be seen; NVIDIA might have done other optimizations, after all.

I am interested in a more "budget-friendly" comparison between the upcoming RTX 4070 and the RTX 3080 12GB: same memory size and potentially similar pricing, but the 4070 has 30% more compute while the 3080 has 80% more bandwidth.

The numbers are from here: RTX 3080 12GB | RTX 4070

3

tysam_and_co t1_iv3qkvj wrote

Thanks for the comparisons. Multi-GPU is always an interesting one. Hopefully they get things ironed out; there are things on this architecture that I really do like a lot :)

1

Fleischhauf t1_iyam8rb wrote

How does the 3090 Ti perform in comparison? Is it more in the middle, or rather similar to the 3090? I'm trying to decide between a 4090 and a 3090 Ti right now...

1

danielfm123 t1_iv22wjz wrote

I just want to play games

−26

The-Protomolecule t1_iv2kznh wrote

You’re in the wrong subreddit.

15

danielfm123 t1_iv2m5zt wrote

True, but ML is not only about neural networks.

−15

The-Protomolecule t1_iv2mond wrote

Ok. What does that have to do with your comment about gaming, and being in the wrong subreddit?

And it's a DEEP learning benchmark, which is heavily focused on neural networks.

10