Dimensionality reduction is a crucial first step for many unsupervised learning tasks, including anomaly detection. Autoencoders are a popular mechanism for accomplishing dimensionality reduction. For dimensionality reduction to be effective on high-dimensional data that embed a nonlinear low-dimensional manifold, it is understood that some form of geodesic distance metric should be used to discriminate between data samples. Inspired by the success of neighborhood-aware, shortest-path-based geodesic approximators such as ISOMAP, in this work we propose to use a minimum spanning tree (MST), a graph-based algorithm, to approximate the local neighborhood structure and generate structure-preserving distances among data points. We use this MST-based distance metric to replace the Euclidean distance metric in the embedding function of autoencoders and develop a new graph-regularized autoencoder. Over 20 benchmark anomaly detection datasets, it outperforms both the plain autoencoder with no regularizer and autoencoders using a Euclidean-based regularizer. We furthermore incorporate the MST regularizer into two generative adversarial networks and find that it substantially improves anomaly detection performance for both.
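As a rough illustration of the MST-based distance idea, the following Python sketch builds a minimum spanning tree over pairwise Euclidean distances and measures the distance between two points as the length of the unique path joining them in that tree. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mst_distances` is hypothetical, the points are assumed to be distinct (SciPy treats zero-weight entries in a dense matrix as missing edges), and how these distances enter the autoencoder regularizer is not specified here.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
from scipy.spatial.distance import pdist, squareform


def mst_distances(X):
    """Pairwise path lengths along the Euclidean MST of the rows of X.

    Hypothetical helper for illustration; assumes all rows of X are distinct.
    """
    # Dense (n, n) matrix of Euclidean distances between samples.
    euclid = squareform(pdist(X))
    # Sparse matrix holding the n - 1 edges of the minimum spanning tree.
    tree = minimum_spanning_tree(euclid)
    # Treat the tree as undirected and sum edge weights along each tree path.
    return shortest_path(tree, directed=False)


# Three points forming an "L": the tree keeps the two unit-length edges.
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
D = mst_distances(X)
```

In this toy example the two end points are about 1.41 apart in Euclidean terms, but the tree path between them runs through the corner point and has length 2, which is the kind of structure-following distance the abstract describes.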