In non-smooth stochastic optimization, we establish the non-convergence of stochastic subgradient descent (SGD) to the critical points recently called active strict saddles by Davis and Drusvyatskiy. Such points lie on a manifold $M$ where the function $f$ has a direction of second-order negative curvature. Off this manifold, the norm of the Clarke subdifferential of $f$ is lower-bounded. We require two conditions on $f$. The first assumption is a Verdier stratification condition, which is a refinement of the popular Whitney stratification. It allows us to establish a reinforced version of the projection formula of Bolte \emph{et al.}\ for Whitney stratifiable functions, which is of independent interest. The second assumption, termed the angle condition, allows us to control the distance of the iterates to $M$. When $f$ is weakly convex, our assumptions are generic. Consequently, generically in the class of definable weakly convex functions, SGD converges to a local minimizer.
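For reference, a minimal sketch of the stochastic subgradient recursion under study, written with illustrative notation (step sizes $\gamma_k$ and a zero-mean noise sequence $\eta_k$ are assumptions here, not necessarily the paper's exact setting):
\[
x_{k+1} = x_k - \gamma_k \,(v_k + \eta_k), \qquad v_k \in \partial f(x_k),
\]
where $\partial f$ denotes the Clarke subdifferential of $f$ and $\gamma_k > 0$ is typically taken square-summable but not summable.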