Oceanboi t1_iyq03g2 wrote
Reply to comment by Superschlenz in [D] In an optimal world, how would you wish variance between runs based on different random seeds was reported in papers? by optimized-adam
Why do you say an optimal learning algorithm should have zero hyperparameters? Are you saying an optimal neural network would learn things like batch size, learning rate, optimal optimizer (lol), input size, etc., on its own? In that case, wouldn't a model with zero hyperparameters be conceptually the same as a model that has been tuned to the optimal hyperparameter combination?
Theoretically you could make these hyperparameters trainable if you had the coding chops, so why are we still as a community tweaking hyperparameters iteratively?
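For what it's worth, people have tried exactly this for some hyperparameters. Here's a rough sketch of one such idea, "learning the learning rate" via hypergradient descent (the function name and constants are just illustrative, not anyone's actual implementation):

```python
# Rough sketch: adapting the learning rate with a hypergradient.
# hd_sgd_step and hyper_lr are illustrative names, not from any library.
import torch

def hd_sgd_step(params, grads, prev_grads, lr, hyper_lr=1e-4):
    """One SGD step in which the learning rate is itself nudged
    using the dot product of the current and previous gradients."""
    # Hypergradient: dot product of current and previous gradients.
    hypergrad = sum((g * pg).sum().item() for g, pg in zip(grads, prev_grads))
    lr = lr + hyper_lr * hypergrad
    # Plain SGD update with the adapted learning rate.
    with torch.no_grad():
        for p, g in zip(params, grads):
            p -= lr * g
    return lr
```

You'd keep the previous step's gradients around and call this each iteration. Of course, this just trades one hyperparameter (lr) for another (hyper_lr), which is part of why the problem never fully goes away.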
Superschlenz t1_iyq5oy5 wrote
>Why do you say an optimal learning algorithm should have zero hyperparameters?
Because hyperparameters are fixed by the developer, which means the developer has to know the user's environment in order to tune them. If it requires a developer, it is programming, not learning.
>Are you saying an optimal neural network would learn things like batch size, learning rate, optimal optimizer (lol), input size, etc, on its own?
An optimal learning algorithm wouldn't have those hyperparameters at all, not even static hardware.
>In this case wouldn't a model with zero hyperparameters be the same conceptually as a model that has been tuned to the optimal hyperparameter combination?
Users do not tune hyperparameters, and developers do not know the user's environment. The agent can be broadly pretrained in the developer's laboratory to speed up learning at the user's site, but ultimately it has to learn on its own at the user's site, without a developer around.
>Theoretically you could make these hyperparameters trainable if you had the coding chops, so why are we still as a community tweaking hyperparameters iteratively?
Because you as a community were forced to decide on a career when you were 14 years old, and you chose to become a machine learning engineer because you were more talented than others, and now you are putting on the show of being a useful engineer.
Optimal-Asshole t1_iyqtp3l wrote
No, the reason for hyperparameter optimization isn't job security. It's that choosing better hyperparameters produces better results, which leads to more success in applications. There are people working on automatic hyperparameter optimization.
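For example, an automated search with Optuna looks roughly like this (train_and_evaluate is a hypothetical stand-in for training whatever model you're tuning and returning a validation score):

```python
# Rough sketch of automated hyperparameter search with Optuna.
# train_and_evaluate is a hypothetical helper, not a real library function.
import optuna

def objective(trial):
    # Let the search algorithm propose hyperparameters for this trial.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
    # Return the validation score for these hyperparameters.
    return train_and_evaluate(lr=lr, batch_size=batch_size)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```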
But let’s not act like it’s solely due to some community-caused phenomenon and engineers putting on a show. Honestly, your message comes off as a little bitter.