We consider extensions of the Newton-MR algorithm for nonconvex optimization to settings where Hessian information is approximated. Under an additive noise model on the Hessian matrix, we investigate the iteration and operation complexities of these variants for achieving first- and second-order sub-optimality criteria. We show that, under certain conditions, the algorithms achieve iteration and operation complexities that match those of the exact variant. Focusing on nonconvex problems that satisfy the Polyak-Łojasiewicz condition, we show that our algorithm achieves a linear convergence rate. Finally, we compare the performance of our algorithms with several alternatives on a few machine learning problems.
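
For orientation, the two conditions named in the abstract are commonly written as below. This is a sketch in standard textbook notation, not necessarily the formulation used in the paper: the symbols $\tilde{H}$, $E$, $\varepsilon$, $\mu$, and $f^\star$ are assumptions introduced here for illustration.

```latex
% Additive noise model on the Hessian (assumed notation):
% the algorithm sees \tilde{H}(x) instead of the exact Hessian \nabla^2 f(x).
\tilde{H}(x) = \nabla^2 f(x) + E(x), \qquad \|E(x)\| \le \varepsilon .

% Polyak-Lojasiewicz (PL) condition with constant \mu > 0,
% where f^\star denotes the optimal value of f:
\tfrac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu \,\bigl(f(x) - f^\star\bigr)
\quad \text{for all } x .
```

The PL inequality does not require convexity, which is why it can hold for the nonconvex problems the abstract refers to.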