oh__boy
oh__boy t1_j6h7zc2 wrote
Reply to comment by tysam_and_co in [R] Train CIFAR10 in under 10 seconds on an A100 (new world record!) by tysam_and_co
Do you have a dedicated development set apart from your test set to tune these hyperparameters? Or am I missing the point that this is not meant to be a general improvement but rather to see just how fast you can train with on this single dataset.
oh__boy t1_j6m3sah wrote
Reply to comment by tysam_and_co in [R] Train CIFAR10 in under 10 seconds on an A100 (new world record!) by tysam_and_co
Interesting, thanks for the detailed answer. This is cool work, I also love to work on projects which squeeze out every last ounce of performance possible to solve a problem. I am somewhat skeptical of how much this applies to other architectures / datasets / problems, since you seem to only have worked on one network and one dataset. I hope you try to find general concepts and show that they apply to more than just that network and dataset and prove me wrong though. Good luck with everything!