Stochastic approximation (SA) is a classical algorithm that, since its early days, has had a huge impact on signal processing and, more recently, on machine learning, due to the need to process large amounts of data observed under uncertainty. A prominent special case of SA is the popular stochastic (sub)gradient algorithm, which is the workhorse behind many important applications. A lesser-known fact is that the SA scheme also extends to non-stochastic-gradient algorithms such as compressed stochastic gradient, stochastic expectation-maximization, and a number of reinforcement learning algorithms. The aim of this article is to introduce the non-stochastic-gradient perspective of SA to the signal processing and machine learning audiences by presenting a theory-backed design guideline for SA algorithms. Our central theme is to propose a general framework that unifies existing theories of SA, including its non-asymptotic and asymptotic convergence results, and to demonstrate their applications to popular non-stochastic-gradient algorithms. We build our analysis framework on classes of Lyapunov functions that satisfy a variety of mild conditions. We draw connections between non-stochastic-gradient algorithms and scenarios in which the Lyapunov function is smooth, convex, or strongly convex. Using this framework, we illustrate the convergence properties of non-stochastic-gradient algorithms through concrete examples. Extensions to emerging variance reduction techniques for improved sample complexity are also discussed.
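As a minimal illustration (not taken from the article), the generic SA recursion is theta_{k+1} = theta_k - gamma_k * H(theta_k, X_k), where H is a noisy update direction and gamma_k is a diminishing step size satisfying the classical Robbins-Monro conditions. The stochastic gradient algorithm is recovered when H is an unbiased estimate of a gradient; the function names and the toy quadratic objective below are illustrative assumptions.

```python
import random

def stochastic_approximation(h, theta0, steps, seed=0):
    """Generic SA recursion: theta_{k+1} = theta_k - gamma_k * H(theta_k, X_k)."""
    rng = random.Random(seed)
    theta = theta0
    for k in range(1, steps + 1):
        gamma = 1.0 / k  # diminishing steps: sum gamma_k = inf, sum gamma_k^2 < inf
        theta -= gamma * h(theta, rng)
    return theta

# SGD special case: H(theta, X) is a noisy gradient of f(theta) = 0.5*(theta - 2)^2,
# so the mean field E[H(theta, X)] = theta - 2 drives the iterates toward theta* = 2.
def noisy_grad(theta, rng):
    return (theta - 2.0) + rng.gauss(0.0, 0.1)

est = stochastic_approximation(noisy_grad, theta0=10.0, steps=5000)
print(est)  # approaches the root theta* = 2 as the number of steps grows
```

Here V(theta) = 0.5*(theta - 2)^2 itself serves as a (strongly convex, smooth) Lyapunov function, which is the simplest instance of the analysis classes discussed in the article.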