We study the acceleration of the Local Polynomial Interpolation-based Gradient Descent method (LPI-GD), recently proposed for the approximate solution of empirical risk minimization (ERM) problems. We focus on loss functions that are strongly convex and smooth with condition number $\sigma$. We additionally assume that the loss function is $\eta$-H\"older continuous with respect to the data. The oracle complexity of LPI-GD is $\tilde{O}\left(\sigma m^d \log(1/\varepsilon)\right)$ for a desired accuracy $\varepsilon$, where $d$ is the dimension of the parameter space and $m$ is the cardinality of an approximation grid. The factor $m^d$ can be shown to scale as $O((1/\varepsilon)^{d/2\eta})$. LPI-GD has been shown to have better oracle complexity than gradient descent (GD) and stochastic gradient descent (SGD) in certain parameter regimes. We propose two accelerated methods for the ERM problem based on LPI-GD and show that they attain an oracle complexity of $\tilde{O}\left(\sqrt{\sigma} m^d \log(1/\varepsilon)\right)$. Moreover, we provide the first empirical study of local polynomial interpolation-based gradient methods; it corroborates that LPI-GD outperforms GD and SGD in some scenarios and that the proposed methods achieve acceleration.
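To make the size of the claimed improvement concrete, the two oracle-complexity bounds can be compared numerically. The following is a minimal sketch, not the authors' code: it drops all constants and the logarithmic factors hidden by $\tilde{O}$, substitutes the stated scaling for $m^d$ with the exponent read as $d/(2\eta)$ (an assumption about the notation), and uses hypothetical function names and parameter values chosen only for illustration.

```python
import math


def grid_factor(eps, d, eta):
    # Stand-in for m^d, using the scaling (1/eps)^(d / (2 * eta)).
    # Assumption: the exponent d/2eta in the text means d / (2 * eta).
    return (1.0 / eps) ** (d / (2.0 * eta))


def lpi_gd_bound(sigma, eps, d, eta):
    # sigma * m^d * log(1/eps), with constants and hidden log factors dropped.
    return sigma * grid_factor(eps, d, eta) * math.log(1.0 / eps)


def accelerated_bound(sigma, eps, d, eta):
    # sqrt(sigma) * m^d * log(1/eps), with constants and hidden log factors dropped.
    return math.sqrt(sigma) * grid_factor(eps, d, eta) * math.log(1.0 / eps)


# Hypothetical parameter values, for illustration only.
sigma, eps, d, eta = 100.0, 1e-3, 2, 4.0

plain = lpi_gd_bound(sigma, eps, d, eta)
fast = accelerated_bound(sigma, eps, d, eta)
ratio = plain / fast  # equals sqrt(sigma) up to floating-point rounding

print(f"LPI-GD bound:      {plain:.3f}")
print(f"accelerated bound: {fast:.3f}")
print(f"ratio:             {ratio:.3f}")
```

Under these simplifications the grid factor and the logarithm cancel in the ratio, so the gap between the two bounds is exactly $\sqrt{\sigma}$: the benefit of acceleration grows with the condition number and does not depend on $\varepsilon$, $d$, or $\eta$.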