Exponential generalization bounds with near-tight rates have recently been established for uniformly stable learning algorithms. The notion of uniform stability, however, is stringent in the sense that it does not depend on the data-generating distribution. Under the weaker, distribution-dependent notions of stability such as hypothesis stability and $L_2$-stability, the literature suggests that only polynomial generalization bounds are attainable in general. The present paper addresses this long-standing tension between the two regimes of results and makes progress towards resolving it within a classic confidence-boosting framework. To this end, we first establish an in-expectation first-moment generalization error bound for potentially randomized learning algorithms with $L_2$-stability, based on which we then show that a properly designed subbagging process leads to near-tight exponential generalization bounds over the randomness of both data and algorithm. We further instantiate these generic results for stochastic gradient descent (SGD) to derive improved high-probability generalization bounds for convex and non-convex optimization problems with natural time-decaying learning rates, bounds that have not been possible to obtain with the existing hypothesis-stability or uniform-stability based results.
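To make the confidence-boosting device concrete, the display below sketches a generic subbagging scheme of the kind referred to above; the subsample size $m$, the number of subsamples $k$, and the uniform averaging used for aggregation are illustrative assumptions rather than the paper's exact construction. Given a sample $S = \{z_1, \dots, z_n\}$, one draws $k$ subsamples $S_1, \dots, S_k \subset S$ of size $m$ uniformly without replacement, runs the (possibly randomized) algorithm $A$ on each, and aggregates the resulting models:
\[
\bar{A}(S) \;=\; \frac{1}{k} \sum_{j=1}^{k} A(S_j), \qquad |S_j| = m, \quad j = 1, \dots, k.
\]
Intuitively, it is the concentration of the aggregate over the many (nearly independent) subsample draws that allows an in-expectation stability bound to be boosted into a high-probability, exponential-tail guarantee.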