This paper proposes a novel framework for anomaly estimation. The basic idea is to map the data into a two-dimensional space and then rank each data point in that reduced space. We estimate the degree of anomaly in both the spatial and the density domains. Specifically, we transform the data points into a density space and measure, in the density domain, the distances between each point and its k-nearest neighbors in the spatial domain. An anomaly coordinate system is then built by collecting two unilateral anomalies from the k-nearest neighbors of each point. Furthermore, we introduce two schemes to model the correlation between these two anomalies and combine them into the final anomaly score. Experiments on synthetic and real-world datasets demonstrate that the proposed method performs well and achieves the highest average performance. We also show that the proposed method supports visualization and classification of anomalies in a simple manner. Because anomalies are complex, no existing method performs best on all benchmark datasets. Our method takes both the spatial domain and the density domain into account and can be adapted to different datasets by manually adjusting a few parameters.
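The pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the density estimate, the two combination schemes, or the normalization, so every concrete choice here is an assumption made for illustration. Density is taken as the inverse of the mean k-nearest-neighbor distance, the spatial anomaly as that mean distance, the density anomaly as the mean absolute density difference to the spatial neighbors, and the final score as the Euclidean norm of the two min-max-normalized coordinates. All function names are hypothetical.

```python
import math


def knn(points, k):
    """Brute-force k nearest neighbors of each point, excluding the point itself.

    Returns, for each point, a list of (distance, neighbor_index) pairs.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        result.append(dists[:k])
    return result


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def anomaly_coordinates(points, k=3):
    """Return (spatial, density) anomaly coordinates for each point.

    Assumed definitions (not taken from the paper):
      spatial anomaly = mean distance to the k nearest neighbors
      density         = 1 / mean k-NN distance
      density anomaly = mean |density(p) - density(q)| over spatial neighbors q
    """
    neighbors = knn(points, k)
    spatial = [sum(d for d, _ in nb) / k for nb in neighbors]
    density = [1.0 / (s + 1e-12) for s in spatial]  # small term avoids division by zero
    density_gap = [
        sum(abs(density[i] - density[j]) for _, j in nb) / k
        for i, nb in enumerate(neighbors)
    ]
    return min_max(spatial), min_max(density_gap)


def anomaly_scores(points, k=3):
    """Combine both coordinates into one score (assumed scheme: Euclidean norm)."""
    xs, ys = anomaly_coordinates(points, k)
    return [math.hypot(x, y) for x, y in zip(xs, ys)]
```

Calling `anomaly_scores` on a tight cluster plus one distant point should rank the distant point highest, and the pairs returned by `anomaly_coordinates` can be plotted directly as a two-dimensional anomaly map. The parameter `k` plays the role of one of the manually adjusted parameters mentioned above.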