A fundamental quest in the theory of deep learning is to understand the properties of the trajectories that a learning algorithm takes in weight space. One such property that has recently been isolated is "local elasticity" ($S_{\rm rel}$), which quantifies how the influence of a sampled data point propagates to the prediction at another data point. In this work, we perform a comprehensive study of local elasticity by providing new theoretical insights and more careful empirical evidence of this property in a variety of settings. First, specific to the classification setting, we propose a new definition of the original idea of $S_{\rm rel}$. Via experiments on state-of-the-art neural networks trained on SVHN, CIFAR-10, and CIFAR-100, we demonstrate how our new $S_{\rm rel}$ detects that weight updates preferentially change predictions within the same class as the sampled data point. Next, we demonstrate via examples of neural nets performing regression that the original $S_{\rm rel}$ reveals a two-phase behaviour: training proceeds via an initial elastic phase, when $S_{\rm rel}$ changes rapidly, and an eventual inelastic phase, when $S_{\rm rel}$ remains large. Lastly, we give multiple examples of learning via gradient flows for which one can obtain a closed-form expression for the original $S_{\rm rel}$ function. By studying the plots of these derived formulas, we give a theoretical demonstration of some of the experimentally detected properties of $S_{\rm rel}$ in the regression setting.
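The core quantity described above — how much an SGD update on one point $x$ moves the prediction at another point $x'$, relative to how much it moves the prediction at $x$ itself — can be sketched in a few lines. The linear predictor, squared loss, and the exact ratio-style definition below are illustrative assumptions for a minimal example, not the paper's formulas.

```python
import numpy as np

def predict(w, x):
    # toy linear predictor f_w(x) = w . x, standing in for a neural net
    return np.dot(w, x)

def s_rel(w, x, x_prime, y, lr=0.1):
    """Ratio-style local elasticity: change in prediction at x' vs at x
    after one SGD step on the sample (x, y) with squared loss."""
    grad = 2.0 * (predict(w, x) - y) * x          # grad of (f_w(x) - y)^2
    w_new = w - lr * grad                          # one SGD update on x
    delta_x = abs(predict(w_new, x) - predict(w, x))
    delta_xp = abs(predict(w_new, x_prime) - predict(w, x_prime))
    return delta_xp / (delta_x + 1e-12)

rng = np.random.default_rng(0)
w = rng.standard_normal(5)
x = rng.standard_normal(5)
y = 1.0

# A point similar to x should feel a large influence; a point orthogonal
# to x (for this linear model, grad is parallel to x) should feel none.
x_near = x + 0.1 * rng.standard_normal(5)
x_far = rng.standard_normal(5)
x_far -= (np.dot(x_far, x) / np.dot(x, x)) * x    # make x_far orthogonal to x

print(s_rel(w, x, x_near, y))  # close to 1: strong influence on similar point
print(s_rel(w, x, x_far, y))   # close to 0: little influence on dissimilar point
```

For this linear toy model the update changes the prediction at $x'$ in proportion to $|x \cdot x'|$, which is why similar inputs register high elasticity and orthogonal ones register none; the interest of $S_{\rm rel}$ lies in tracking the analogous ratio for nonlinear networks over the course of training.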