In classification problems, supervised machine-learning methods outperform traditional algorithms, thanks to the ability of neural networks to learn complex patterns. However, in two-class classification tasks such as anomaly or fraud detection, unsupervised methods could do even better, because their predictions are not limited to previously learned types of anomalies. An intuitive approach to anomaly detection can be based on the distances from the centers of mass of the two respective classes. Autoencoders, although trained without supervision, can also detect anomalies: measured from the center of mass of the normal points, each reconstruction has a radius, and the largest radii most likely indicate anomalous points. Of course, radius-based classification is already possible without interposing an autoencoder; in any space, radial classification can be carried out to some extent. To outperform it, we combine radial deformations of the data (i.e., centric compressions or expansions of the axes) with autoencoder training. Any autoencoder that makes use of a data center is here named a centric autoencoder (cAE). A special type is the cAE trained on a uniformly compressed dataset, named the centripetal autoencoder (cpAE). The new concept is studied here on a schematic artificial dataset, and the derived methods show consistent score improvements. When tested on real banking data, however, our supervised radial-deformation algorithms alone still perform better than cAEs, as expected of most supervised methods; nonetheless, in hybrid approaches, cAEs can be combined with a radial deformation of space, improving the classification score. We expect that centric autoencoders will become indispensable tools in geometry-based live anomaly detection, because they build naturally on geometrical algorithms and are natively capable of detecting unknown anomaly types.
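The radial baseline and the uniform compression mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `radial_scores` and `centripetal_compress`, the synthetic Gaussian data, the compression factor of 0.5, and the 99th-percentile cutoff are all assumptions chosen for illustration, and no autoencoder is trained here.

```python
import numpy as np

def radial_scores(X, center):
    """Euclidean distance (radius) of each row of X from the given center."""
    return np.linalg.norm(X - center, axis=1)

def centripetal_compress(X, center, factor=0.5):
    """Uniformly shrink every point toward the center.

    `factor` is a hypothetical parameter in (0, 1]; 1.0 leaves X unchanged.
    """
    return center + factor * (X - center)

# Synthetic data (assumption): a normal cluster and a distant anomalous one.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(500, 2))
anomalies = rng.normal(0.0, 1.0, size=(20, 2)) + 6.0

# Center of mass of the normal points only.
center = normal.mean(axis=0)

# Assumed cutoff: the 99th percentile of the normal radii.
threshold = np.quantile(radial_scores(normal, center), 0.99)

# Radial classification: points with a radius above the cutoff are flagged.
X = np.vstack([normal, anomalies])
flagged = radial_scores(X, center) > threshold

# A uniformly compressed copy of the data, as a cpAE-style training input.
X_compressed = centripetal_compress(X, center, factor=0.5)
```

In this sketch the compression rescales every radius by the same factor, so on its own it leaves the radial ranking of the points unchanged; any gain would have to come from how an autoencoder trained on the compressed data reconstructs normal versus anomalous points.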