Dimensionality reduction is considered an important step for ensuring competitive performance in unsupervised learning tasks such as anomaly detection. Non-negative matrix factorization (NMF) is a widely used method for this purpose. But NMF has no provision for including neighborhood structure information and, as a result, may fail to perform satisfactorily in the presence of nonlinear manifold structure. To address that shortcoming, we propose to incorporate neighborhood structural similarity information within the NMF framework by modeling the data through a minimum spanning tree (MST). We label the resulting method the neighborhood structure assisted NMF. We further devise both offline and online algorithmic versions of the proposed method. Empirical comparisons using twenty benchmark datasets as well as an industrial dataset extracted from a hydropower plant demonstrate the superiority of the neighborhood structure assisted NMF and support our claim of merit. Comparing the formulation and properties of the neighborhood structure assisted NMF with those of other recent, enhanced versions of NMF reveals that including the neighborhood structure information through the MST plays a key role in attaining the enhanced performance in anomaly detection.
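To make the idea of combining an MST-based neighborhood structure with NMF concrete, the following is a minimal sketch under stated assumptions. The abstract does not give the objective function of the neighborhood structure assisted NMF, so the sketch substitutes the graph-Laplacian penalty and multiplicative updates of graph-regularized NMF, an earlier and different method, as a stand-in. The function names `mst_adjacency`, `graph_regularized_nmf`, and `anomaly_scores`, the parameter `lam`, and the use of per-sample reconstruction error as the anomaly score are all illustrative assumptions, not the proposed algorithm.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def mst_adjacency(X):
    """Symmetric 0/1 adjacency matrix of the Euclidean MST over the rows of X."""
    dist = squareform(pdist(X))  # dense (n, n) pairwise Euclidean distances
    # SciPy treats zero entries as "no edge", so exact duplicate rows would
    # need special handling that this sketch omits.
    tree = minimum_spanning_tree(dist).toarray()  # each edge stored once
    return ((tree + tree.T) > 0).astype(float)


def graph_regularized_nmf(X, A, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Factor nonnegative X (n x d) as H (n x k) @ B (k x d).

    ASSUMPTION: uses the graph-regularized NMF objective
        ||X - H B||_F^2 + lam * trace(H^T (D - A) H)
    as a stand-in; this is not the objective of the proposed method.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    H = rng.random((n, k))
    B = rng.random((k, d))
    D = np.diag(A.sum(axis=1))  # degree matrix of the MST graph
    for _ in range(n_iter):
        # Multiplicative updates keep both factors nonnegative.
        B *= (H.T @ X) / (H.T @ H @ B + eps)
        H *= (X @ B.T + lam * (A @ H)) / (H @ B @ B.T + lam * (D @ H) + eps)
    return H, B


def anomaly_scores(X, H, B):
    """ASSUMPTION: score each sample by its reconstruction error."""
    return np.linalg.norm(X - H @ B, axis=1)
```

A typical call would build `A = mst_adjacency(X)` from a nonnegative data matrix `X` with one sample per row, fit `H, B = graph_regularized_nmf(X, A, k=3)`, and rank samples by `anomaly_scores(X, H, B)`, with larger values treated as more anomalous.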