Submitted by randomkolmogorov t3_zf25ue in MachineLearning
Hello!
I've been working on a project that uses the natural gradient, and I was wondering if anyone has suggestions on ways to include higher-order information. Is there some equivalent of the Hessian for natural gradients? If not, is there a way to find a trust region in which the natural-gradient approximation is reasonable?
Thank you!
Ulfgardleo t1_iza5481 wrote
The Hessian is always better information than the natural gradient: it captures the actual curvature of the objective function, whereas the NG (the Fisher matrix) only captures the curvature of the model's predictive distribution. So any second-order trust-region approach built on NG information will, at best, approach what the Hessian already gives you.
//edit: I am assuming actual trust-region methods, like TR-Newton, and not the approximation schemes common in RL/ML.
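One standard way to get a trust region on top of the natural gradient, per the discussion above, is Levenberg-Marquardt-style damping: solve (F + λI)d = -g and adapt λ from the ratio of actual to predicted loss reduction. Below is a minimal sketch on a toy logistic-regression problem (all data and constants are illustrative assumptions, not from the thread); for logistic regression the empirical Fisher used here coincides with the Gauss-Newton matrix.

```python
import numpy as np

# Toy logistic-regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def nll(w):
    # Mean negative log-likelihood.
    z = X @ w
    return np.mean(np.log1p(np.exp(z)) - y * z)

def grad(w):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

def fisher(w):
    # Empirical Fisher: mean of p(1-p) x x^T for logistic regression.
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (X * (p * (1.0 - p))[:, None]).T @ X / len(y)

w = np.zeros(3)
lam = 1.0  # damping strength, acting as an implicit trust-region radius
for _ in range(50):
    g, F = grad(w), fisher(w)
    step = np.linalg.solve(F + lam * np.eye(3), -g)
    # Compare actual reduction to the local quadratic model's prediction.
    pred = -(g @ step + 0.5 * step @ F @ step)
    actual = nll(w) - nll(w + step)
    rho = actual / max(pred, 1e-12)
    if rho > 0.75:
        lam = max(lam / 2.0, 1e-6)  # model trusted: loosen damping
    elif rho < 0.25:
        lam *= 2.0                  # model poor: tighten damping
    if rho > 0.0:
        w = w + step
```

The ratio test is exactly where "a trust region where the NG approximation is reasonable" shows up: λ grows whenever the Fisher-based quadratic model stops predicting the true loss well.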