ReginaldIII t1_is6e8rx wrote on October 13, 2022 at 5:07 PM

Reply to comment by mlaprise in [N] First RTX 4090 ML benchmarks by killver

Not really. There are still a lot of models being used in production that were written for the old TF graph API.

And if you've tested every prior GPU against that standard benchmark model for years, you keep doing it so you can see how the new card compares.

Edit: And as is this sub's tradition of callous downvoting because your knee-jerk reaction wasn't correct... Here's the relevant part of the article for you:

> TensorFlow 1.15.5 ResNet50
>
> This is the NVIDIA maintained version 1 of TensorFlow which typically offers somewhat better performance than version 2. The benchmark is training 100 steps of the ResNet 50 layer convolution neural network (CNN). The result is the highest images-per-second value from the run steps. FP32 and FP16 (tensorcore) jobs were run.

It's a standard benchmark model! And it performs better than those written for TF2. What more do you want?

https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/ConvNets/resnet50v1.5

kajladk t1_is8gtdt wrote on October 14, 2022 at 1:39 AM

Can anyone explain how taking the max value from 100 runs is a good benchmark, when for most other benchmarks (gaming fps, etc.) the average fps across multiple runs gives a more realistic performance and eliminates any outliers?

afireohno t1_isblrse wrote on October 14, 2022 at 6:42 PM

>average fps across multiple runs gives a more realistic performance and eliminates any outliers

Thanks for the laugh. I'll just leave this here so you can read about why the mean (average) is not a robust measure of central tendency because it is easily skewed by outliers.

kajladk t1_isd009f wrote on October 15, 2022 at 12:43 AM

Umm, I know the mean is more skewed by outliers than the median, but it's still "better" than taking the best value.
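
To make the comparison concrete, here is a rough sketch with made-up numbers. The throughput values are hypothetical and not taken from the article; the last one stands in for a single lucky step.

```python
import statistics

# Hypothetical per-step throughput in images/sec (invented for illustration).
# The final value plays the role of one unusually fast "lucky" step.
throughputs = [1400.0, 1410.0, 1395.0, 1405.0, 1390.0, 1900.0]


def summarize(samples):
    """Return the three summary statistics discussed in this thread."""
    return {
        "max": max(samples),                     # what a best-value report shows
        "mean": statistics.mean(samples),        # pulled upward by the outlier
        "median": statistics.median(samples),    # barely affected by the outlier
    }


summary = summarize(throughputs)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

With these numbers the max reports the outlier itself, the mean is dragged part of the way toward it, and the median stays close to the typical step.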

ReginaldIII t1_is8j7x8 wrote on October 14, 2022 at 1:57 AM

I would say there's precedent for lucky-run benchmark scores. Consider 3DMark as an example.

https://benchmarks.ul.com/hall-of-fame-2/timespy+3dmark+score+performance+preset/version+1.0

All of those runs with different system configurations are people's luckiest runs.

kajladk t1_is8lqr0 wrote on October 14, 2022 at 2:16 AM

But isn't this different? We are comparing a raw metric (fps, images/sec) with an aggregate score, which might already have ways built in to eliminate or regularize some outlier metrics.

ReginaldIII t1_isal2mt wrote on October 14, 2022 at 2:36 PM

Nvidia and Puget want to report a lucky run. Lots of people do this. They're being fully transparent that they are reporting lucky runs. And it makes sense from their perspective to report their best theoretical performance.

It honestly just doesn't bother me to see them doing it, because it's very normal and lots of people report this way, even if we think an average with an error bar would be fairer.
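
For reference, an average with an error bar could be computed along these lines. This is only a sketch: the run values are invented, `mean_with_error` is a hypothetical helper, and using the standard error of the mean as the error bar is an assumption, not something the article does.

```python
import math
import statistics


def mean_with_error(samples):
    """Return (mean, standard error of the mean) for a list of run results."""
    mean = statistics.mean(samples)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, sem


# Hypothetical images/sec results from five repeated runs (made up).
runs = [1400.0, 1410.0, 1395.0, 1405.0, 1390.0]
mean, sem = mean_with_error(runs)
print(f"{mean:.1f} +/- {sem:.1f} images/sec (best run: {max(runs):.1f})")
```

Reporting both the averaged figure and the best run side by side would keep the lucky-run number while still showing how typical it is.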