As data grows in size and complexity, finding frameworks that aid interpretation and analysis has become critical. This is particularly true when data comes from complex systems where extensive structure is available but must be drawn from peripheral sources. In this paper we argue that in such situations, sheaves can provide a natural framework for analyzing how well a statistical model fits at the local level (that is, on subsets of related datapoints) versus the global level (on all the data). The sheaf-based approach that we propose is general enough to be useful in a range of applications, from analyzing sensor networks to understanding the feature space of a deep learning model.