Anomaly detection is an essential problem in machine learning. Application areas include network security, health care, and fraud detection, all of which involve high-dimensional datasets. A typical anomaly detection system faces the class-imbalance problem, in which the sample sizes of the classes differ greatly, and usually a class-overlap problem as well. This study used a capsule network for the anomaly detection task. To the best of our knowledge, this is the first time a capsule network has been analyzed for anomaly detection in a high-dimensional, complex data setting. We also handle the related novelty and outlier detection problems. The architecture of the capsule network was modified to suit a binary classification task. Capsule networks are a good option for detecting anomalies because their predictions capture viewpoint invariance while their internal capsule architecture captures viewpoint equivariance. We used a six-layer undercomplete autoencoder architecture whose second and third layers contain capsules. The capsules were trained using the dynamic routing algorithm. We created 10 imbalanced datasets from the original MNIST dataset and compared the performance of the capsule network with 5 baseline models. Our primary test-set metrics are the F1-score for the minority class and the area under the ROC curve. We found that the capsule network outperformed every baseline model on the anomaly detection task while training for only ten epochs and without using any additional data-level or algorithm-level approach. Thus, we conclude that capsule networks are excellent at modeling complex, high-dimensional, imbalanced datasets for the anomaly detection task.
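To make the evaluation setup concrete, the following is a minimal sketch, not the authors' code, of how an imbalanced binary task might be derived from a labeled dataset such as MNIST and how the two reported metrics can be computed. The abstract does not say how the imbalanced datasets were built, so the construction here (keep every sample of the other classes and only a fraction of one chosen class) is an assumption, and the names `make_imbalanced`, `f1_minority`, `roc_auc`, and `keep_fraction` are hypothetical.

```python
import random


def make_imbalanced(labels, minority_class, keep_fraction, seed=0):
    """Return sorted sample indices for an imbalanced subset.

    Assumption (not stated in the abstract): every sample of the other
    classes is kept, and only `keep_fraction` of the samples labeled
    `minority_class` are kept, chosen at random.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_class]
    majority = [i for i, y in enumerate(labels) if y != minority_class]
    n_keep = max(1, round(len(minority) * keep_fraction))
    kept = rng.sample(minority, n_keep)
    return sorted(majority + kept)


def f1_minority(y_true, y_pred, positive=1):
    """F1-score with the minority (anomaly) class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive sample receives a
    higher score than a random negative one; ties count as one half.
    Quadratic in the number of samples, which is fine for a sketch.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With MNIST labels, calling `make_imbalanced(labels, minority_class=d, keep_fraction=0.1)` once for each digit `d` would give ten imbalanced subsets. This is one plausible reading of the ten datasets, not a confirmed description of them. Reporting F1 for the minority class rather than overall accuracy matters here because a model that labels everything as the majority class can still reach high accuracy on such data.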