We approach the problem of implicit regularization in deep learning from a geometrical viewpoint. We highlight a regularization effect induced by a dynamical alignment of the neural tangent features introduced by Jacot et al. (2018), along a small number of task-relevant directions. This can be interpreted as a combined mechanism of feature selection and compression. By extrapolating a new analysis of Rademacher complexity bounds for linear models, we motivate and study a heuristic complexity measure that captures this phenomenon, in terms of sequences of tangent kernel classes along optimization paths.
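To make the alignment phenomenon concrete, the following is a minimal sketch, in JAX, of how one might probe it empirically: it computes the tangent features (per-example gradients of the network output with respect to the parameters), forms the tangent kernel, and scores its centered alignment with the label kernel yy^T, a standard kernel-target alignment quantity rather than the paper's exact complexity measure. The toy MLP, the data, and all function names are illustrative assumptions, not the authors' code.

```python
# A minimal, assumed sketch: tangent kernel of a tiny MLP and its centered
# alignment with the label kernel y y^T. Tracking this score along training
# steps is one way to observe tangent-feature alignment to task directions.
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Two-layer tanh network with a scalar output.
    h = jnp.tanh(x @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"]).squeeze(-1)

def tangent_kernel(params, X):
    # Tangent features: gradient of the output w.r.t. parameters, per example.
    jac = jax.vmap(lambda x: jax.grad(lambda p: mlp(p, x))(params))(X)
    feats = jnp.concatenate(
        [leaf.reshape(X.shape[0], -1) for leaf in jax.tree_util.tree_leaves(jac)],
        axis=1,
    )
    # K(x, x') = <grad_theta f(x), grad_theta f(x')>.
    return feats @ feats.T

def kernel_alignment(K, y):
    # Centered kernel-target alignment: <Kc, Yc> / (||Kc|| ||Yc||).
    n = K.shape[0]
    H = jnp.eye(n) - jnp.ones((n, n)) / n   # centering matrix
    Kc, Yc = H @ K @ H, H @ jnp.outer(y, y) @ H
    return jnp.sum(Kc * Yc) / (jnp.linalg.norm(Kc) * jnp.linalg.norm(Yc))

# Toy usage: labels depend on a single "task-relevant" input direction.
key_x, key_w1, key_w2 = jax.random.split(jax.random.PRNGKey(0), 3)
X = jax.random.normal(key_x, (32, 5))
y = jnp.sign(X[:, 0])
params = {
    "W1": jax.random.normal(key_w1, (5, 16)) / jnp.sqrt(5.0),
    "b1": jnp.zeros(16),
    "W2": jax.random.normal(key_w2, (16, 1)) / jnp.sqrt(16.0),
    "b2": jnp.zeros(1),
}
print("alignment:", kernel_alignment(tangent_kernel(params, X), y))
```

Under the alignment effect described above, this score would tend to increase as training concentrates the tangent features along the few directions relevant to the labels.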