Architectural Optimization and Feature Learning for High-Dimensional Time Series Datasets

As our ability to sense increases, we are experiencing a transition from data-poor problems, in which the central issue is a lack of relevant data, to data-rich problems, in which the central issue is to identify a few relevant features in a sea of observations. Motivated by applications in gravitational-wave astrophysics, we study the problem of predicting the presence of transient noise artifacts in a gravitational wave detector from a rich collection of measurements from the detector and its environment. We argue that feature learning--in which relevant features are optimized from data--is critical to achieving high accuracy. We introduce models that reduce the error rate by over 60% compared to the previous state of the art, which used fixed, hand-crafted features. Feature learning is useful not only because it improves performance on prediction tasks; the results provide valuable information about patterns associated with phenomena of interest that would otherwise be undiscoverable. In our application, features found to be associated with transient noise provide diagnostic information about its origin and suggest mitigation strategies. Learning in high-dimensional settings is challenging. Through experiments with a variety of architectures, we identify two key factors in successful models: sparsity, for selecting relevant variables within the high-dimensional observations; and depth, which confers flexibility for handling complex interactions and robustness with respect to temporal variations. We illustrate their significance through systematic experiments on real detector data. Our results provide experimental corroboration of common assumptions in the machine-learning community and have direct applicability to improving our ability to sense gravitational waves, as well as to many other problem settings with similarly high-dimensional, noisy, or partly irrelevant data.
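The role of sparsity described above can be illustrated with a small sketch. This is not the models introduced in this work: it is a plain L1-penalized logistic regression fitted by proximal gradient descent (ISTA) on synthetic data, in which only a few of many input variables carry signal. The function names, the penalty weight `lam`, and the synthetic data generator are all assumptions made for illustration, and the random inputs merely stand in for detector and environmental measurements.

```python
import numpy as np


def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def fit_sparse_logistic(X, y, lam=0.05, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent (ISTA).

    Hypothetical helper for illustration; lam is an assumed penalty weight.
    """
    n, d = X.shape
    # The gradient of the mean logistic loss is Lipschitz with constant
    # ||X||_2^2 / (4 n), so 1 / L is a safe fixed step size.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        grad = X.T @ (p - y) / n             # gradient of the mean logistic loss
        w = soft_threshold(w - step * grad, step * lam)
    return w


def make_synthetic(n=2000, d=200, relevant=(3, 17, 42), seed=0):
    """Synthetic stand-in: d variables, only the `relevant` ones affect the label."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    logits = 3.0 * X[:, list(relevant)].sum(axis=1)
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
    return X, y


X, y = make_synthetic()
w = fit_sparse_logistic(X, y)
selected = np.flatnonzero(w)  # indices of variables kept by the L1 penalty
print("selected variables:", selected)
```

In this toy setting the nonzero weights play the diagnostic role mentioned above: they point to the few variables associated with the label. A linear model of this kind captures only the sparsity ingredient; it does not address depth, which the experiments identify as the second key factor.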
