In stochastic optimization, a common tool for dealing sequentially with large samples is the well-known stochastic gradient algorithm. Nevertheless, since the step sequence is the same for every direction, this can lead to poor results in practice for ill-conditioned problems. To overcome this, adaptive gradient algorithms such as Adagrad or Stochastic Newton algorithms should be preferred. This paper is devoted to the non-asymptotic analysis of these adaptive gradient algorithms for strongly convex objectives. All the theoretical results are adapted to linear regression and regularized generalized linear models, for both Adagrad and Stochastic Newton algorithms.
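To illustrate the difference between a single step sequence and a coordinate-wise adaptive one, here is a minimal sketch (not the paper's algorithm or proofs) contrasting a plain SGD update with a standard Adagrad-style update; the function names and the toy quadratic objective are assumptions for illustration only.

```python
import numpy as np

def sgd_step(theta, grad, step):
    # Plain SGD: the same scalar step size is used in every direction,
    # which can behave poorly on ill-conditioned problems.
    return theta - step * grad

def adagrad_step(theta, grad, accum, step=0.5, eps=1e-8):
    # Adagrad-style update: accumulate squared gradients coordinate-wise
    # and rescale the step in each direction accordingly.
    accum = accum + grad ** 2
    return theta - step * grad / (np.sqrt(accum) + eps), accum

# Toy ill-conditioned quadratic: f(theta) = 0.5 * theta^T H theta
H = np.diag([100.0, 1.0])          # very different curvatures per direction
theta_sgd = np.array([1.0, 1.0])
theta_ada = np.array([1.0, 1.0])
accum = np.zeros(2)

rng = np.random.default_rng(0)
for _ in range(200):
    noise = rng.normal(scale=0.1, size=2)   # noisy gradient, as in the stochastic setting
    theta_sgd = sgd_step(theta_sgd, H @ theta_sgd + noise, step=0.005)
    theta_ada, accum = adagrad_step(theta_ada, H @ theta_ada + noise, accum)

print("SGD iterate:    ", theta_sgd)
print("Adagrad iterate:", theta_ada)
```

With a single step size, SGD must stay small enough for the stiff direction and therefore makes slow progress along the flat one; the coordinate-wise rescaling in Adagrad mitigates this imbalance.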