We provide a unified framework, applicable to a general family of convex losses and across binary and multiclass settings in the overparameterized regime, to approximately characterize the implicit bias of gradient descent in closed form. Specifically, we show that in high dimensions the implicit bias is approximated by (but not exactly equal to) the minimum-norm interpolation that arises from training on the squared loss. In contrast to prior work, which was tailored to exponentially-tailed losses and relied on an intermediate support-vector-machine formulation, our framework builds directly on the primal-dual analysis of Ji and Telgarsky (2021), allowing us to establish new approximate equivalences for general convex losses through a novel sensitivity analysis. Our framework also recovers existing exact equivalence results for exponentially-tailed losses in both the binary and multiclass settings. Finally, we provide evidence for the tightness of our techniques, which we use to demonstrate the effect of certain loss functions designed for out-of-distribution problems on the closed-form solution.
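As a rough numerical illustration of the headline claim (this sketch is not the paper's method), the snippet below trains gradient descent on the logistic loss in an overparameterized Gaussian design and compares the resulting direction against the minimum-norm interpolant of the labels. The problem sizes, step size, and iteration count are arbitrary assumptions chosen only so the comparison runs quickly.

```python
# Minimal sketch: in the overparameterized regime, the direction found by
# gradient descent on the logistic loss tends to align with the minimum-norm
# interpolant of the +/-1 labels (the ridgeless least-squares solution).
# All constants below (n, d, lr, iteration count) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 1000                          # n samples, d >> n features
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.sign(rng.standard_normal(n))      # random binary labels

# Minimum-norm interpolation of the labels: w_mn = X^T (X X^T)^{-1} y
w_mn = X.T @ np.linalg.solve(X @ X.T, y)

# Gradient descent on the average logistic loss, run long enough for the
# margins to grow and the direction to stabilize.
w = np.zeros(d)
lr = 10.0
for _ in range(20_000):
    margins = y * (X @ w)
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / n
    w -= lr * grad

cos = w @ w_mn / (np.linalg.norm(w) * np.linalg.norm(w_mn))
print(f"cosine similarity between GD direction and min-norm interpolant: {cos:.4f}")
```

Under this kind of design the printed cosine similarity is close to one, consistent with the approximate (rather than exact) equivalence described above; for losses outside the exponentially-tailed family the gap is what the sensitivity analysis quantifies.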