Experimental sciences have come to depend heavily on our ability to organize, interpret, and analyze high-dimensional datasets produced from observations of a large number of variables governed by natural processes. Natural laws, conservation principles, and dynamical structure introduce intricate interdependencies among these observed variables, which in turn impose geometric structure, with fewer degrees of freedom, on the dataset. We show how fine-scale features of this structure can be extracted from \emph{discrete} approximations to quantum mechanical processes given by data-driven graph Laplacians and localized wavepackets. This data-driven quantization procedure leads to a novel yet natural uncertainty principle for data analysis, induced by limited data. We illustrate the new approach with algorithms and several applications to real-world data, including the learning of patterns and anomalies in social distancing and mobility behavior during the COVID-19 pandemic.
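As a point of orientation, a data-driven graph Laplacian of the kind mentioned above can be sketched in a few lines. This is a minimal illustration and not the construction used in this work: the Gaussian kernel, the random-walk normalization, the bandwidth parameter `epsilon`, the function name `graph_laplacian`, and the toy circle dataset are all assumptions made for the example.

```python
import numpy as np


def graph_laplacian(points, epsilon):
    """Random-walk graph Laplacian L = I - D^{-1} W (illustrative choice)."""
    # Pairwise squared Euclidean distances between all samples.
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Gaussian affinities; epsilon is a hypothetical bandwidth parameter.
    weights = np.exp(-sq_dists / epsilon)
    # Row sums of the affinity matrix (the diagonal of D).
    degrees = weights.sum(axis=1)
    # Normalize each row of W by its degree, then subtract from the identity.
    return np.eye(len(points)) - weights / degrees[:, None]


# Toy data (assumed for illustration): a noisy circle embedded in 3 dimensions,
# i.e. three observed variables with one underlying degree of freedom.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
points = np.stack(
    [np.cos(theta), np.sin(theta), 0.01 * rng.standard_normal(200)], axis=1
)
L = graph_laplacian(points, epsilon=0.1)
```

Each row of `L` sums to zero, as expected for a Laplacian built from a row-stochastic matrix. The choice of bandwidth controls the scale at which the geometry of the data is resolved.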