自动编码的地形障碍 (Topological Obstructions to Autoencoding)

Autoencoders have been proposed as a powerful tool for model-independent anomaly detection in high-energy physics. The operating principle is that events which do not belong to the space of training data will be reconstructed poorly, thus flagging them as anomalies. We point out that in a variety of examples of interest, the connection between large reconstruction error and anomalies is not so clear. In particular, for data sets with nontrivial topology, there will always be points that erroneously seem anomalous due to global issues. Conversely, neural networks typically have an inductive bias or prior to locally interpolate such that undersampled or rare events may be reconstructed with small error, despite actually being the desired anomalies. Taken together, these facts are in tension with the simple picture of the autoencoder as an anomaly detector. Using a series of illustrative low-dimensional examples, we show explicitly how the intrinsic and extrinsic topology of the dataset affects the behavior of an autoencoder and how this topology is manifested in the latent space representation during training. We ground this analysis in the discussion of a mock "bump hunt" in which the autoencoder fails to identify an anomalous "signal" for reasons tied to the intrinsic topology of $n$-particle phase space.

翻译：在高能物理学中,提出了自动编码器,作为在高能物理中进行模型独立的异常探测的强大工具。操作原则是,不属于培训数据空间的事件将重建得不好,因此将它们标记为异常。我们指出,在各种感兴趣的例子中,大型重建错误和异常之间的联系并不十分明确。特别是,对于具有非三维地形学的数据集来说,由于全球性问题,总是会出现错误地看似异常的点。相反,神经网络通常具有感应偏差,或者在地方内插之前,因此,尽管实际存在所希望的异常现象,但不属于培训数据空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间空间网络可能会发生微小的错误,而这种表层学如何表现。我们把这些分析放在模拟的“顶层搜索”的“顶层图像”中进行模拟的“顶层搜索”中。

相关内容

自编码器

关注 140

自动编码器是一种人工神经网络，用于以无监督的方式学习有效的数据编码。自动编码器的目的是通过训练网络忽略信号“噪声”来学习一组数据的表示（编码），通常用于降维。与简化方面一起，学习了重构方面，在此，自动编码器尝试从简化编码中生成尽可能接近其原始输入的表示形式，从而得到其名称。基本模型存在几种变体，其目的是迫使学习的输入表示形式具有有用的属性。自动编码器可有效地解决许多应用问题，从面部识别到获取单词的语义。

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

专知会员服务

14+阅读 · 2020年6月18日

【CVPR2020】视觉导航的神经拓扑SLAM，Neural Topological SLAM for Visual Navigation

专知会员服务

51+阅读 · 2020年5月26日

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

专知会员服务

27+阅读 · 2020年4月3日

【PUC-牛津-ICLR2020】图神经网络的逻辑表达性，The Logical Expressiveness of GNN

专知会员服务

29+阅读 · 2020年3月15日