utilop t1_ironow2 wrote

Why did this get downvoted?

Is there some fundamental limitation implying that we would have to rely on SGD and cannot do the optimization through superposition?

0

Less-Article1309 t1_irrjhfe wrote

There are plenty of other optimization methods out there, simulated annealing for example. SGD just lends itself well to the massively parallel architecture of Nvidia GPUs, and that's the only reason it's so prevalent in the industry.
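
For anyone wondering what simulated annealing looks like in practice, here is a minimal sketch in Python. The test function `bumpy`, the geometric cooling schedule, and every parameter value are assumptions picked for illustration, not taken from any particular library. Note that it never computes a gradient; it only evaluates the loss.

```python
import math
import random


def simulated_annealing(loss, x0, steps=5000, t_start=1.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    """Minimize a 1-D function with a basic simulated annealing loop.

    All defaults here are illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)
    x, fx = x0, loss(x0)
    best_x, best_fx = x, fx
    for i in range(steps):
        # Geometric cooling: temperature decays from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        # Propose a random neighbour of the current point.
        cand = x + rng.gauss(0.0, step_size)
        fc = loss(cand)
        # Metropolis rule: always accept improvements, and accept worse
        # points with probability exp(-(fc - fx) / t).
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx


def bumpy(x):
    # Hypothetical test function: a parabola with several local minima.
    return x * x + 2.0 * math.sin(5.0 * x)


best_x, best_fx = simulated_annealing(bumpy, x0=3.0)
print(f"best x = {best_x:.3f}, loss = {best_fx:.3f}")
```

The loop is inherently sequential, since each proposal depends on the previously accepted point, whereas a minibatch gradient is one big batched computation. That fits the point about GPUs above.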

1