For large nonlinear least-squares loss functions in machine learning, we exploit the property that the number of model parameters typically exceeds the number of data points in one batch. This implies a low-rank structure in the Hessian of the loss, which enables effective means of computing search directions. Using this property, we develop two algorithms that estimate Jacobian matrices and perform well compared to state-of-the-art methods.
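A minimal sketch of the low-rank structure the abstract refers to (not the paper's algorithms): for a least-squares loss $f(w) = \tfrac{1}{2}\|r(w)\|^2$ with $m$ batch residuals and $n$ parameters, the Gauss-Newton Hessian approximation $J^\top J$ has rank at most $m$, so when $m < n$ a damped search direction can be computed in the small $m$-dimensional space via the Woodbury identity. All variable names and sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 100  # batch residuals m < parameters n (illustrative sizes)

J = rng.standard_normal((m, n))  # stand-in Jacobian of the residuals
H = J.T @ J                      # n x n Gauss-Newton Hessian approximation

# Rank is bounded by the batch size m, not the parameter count n.
print(np.linalg.matrix_rank(H))  # → 20

# Woodbury identity: (J^T J + lam*I)^{-1} g
#   = (g - J^T (J J^T + lam*I)^{-1} J g) / lam
# Only an m x m system is solved instead of an n x n one.
lam = 1e-2
g = J.T @ rng.standard_normal(m)  # gradient-like vector
small = np.linalg.solve(J @ J.T + lam * np.eye(m), J @ g)
p = (g - J.T @ small) / lam

# Agrees with the direct n x n solve.
p_direct = np.linalg.solve(H + lam * np.eye(n), g)
print(np.allclose(p, p_direct))  # → True
```

The same identity underlies many limited-memory and subsampled second-order methods: the cost of the solve scales with the batch size rather than the parameter dimension.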