The underspecification of most machine learning pipelines means that we cannot rely solely on validation performance to assess the robustness of deep learning systems to naturally occurring distribution shifts. Instead, ensuring that a neural network generalizes across a large number of different situations requires understanding the specific way in which it solves a task. In this work, we propose to study this problem from a geometric perspective, with the aim of understanding two key characteristics of neural network solutions in underspecified settings: how is the geometry of the learned function related to the data representation? And are deep networks always biased towards simpler solutions, as conjectured in recent literature? We show that the way neural networks handle the underspecification of these problems is highly dependent on the data representation, affecting both the geometry and the complexity of the learned predictors. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.