Current state-of-the-art Artificial Intelligence (AI) enabled intrusion detection systems use a variety of black box methods. These black box methods are generally trained using Error Based Learning (EBL) techniques with a focus on creating accurate models. The resulting models incur high computational costs and are not easily explainable. A white box Competitive Learning (CL) based eXplainable Intrusion Detection System (X-IDS) offers a potential solution to these problems. CL models utilize an entirely different learning paradigm from EBL approaches, one that makes the CL family of algorithms innately explainable and less resource-intensive. In this paper, we create an X-IDS architecture based on DARPA's recommendation for explainable systems. In our architecture, we leverage CL algorithms such as Self-Organizing Maps (SOM), Growing Self-Organizing Maps (GSOM), and Growing Hierarchical Self-Organizing Maps (GHSOM). The resulting models can be data-mined to create statistical and visual explanations. Our architecture is tested using the NSL-KDD and CIC-IDS-2017 benchmark datasets, and it achieves accuracies 1%-3% lower than those of EBL models. However, CL models are much more explainable than EBL models. Additionally, we apply a pruning process that significantly reduces the size of these CL-based models, which in turn increases prediction speed. Lastly, we analyze the statistical and visual explanations generated by our architecture, and we present a strategy that users can follow to navigate the set of explanations. These explanations help users build trust with an Intrusion Detection System (IDS) and allow them to discover ways to increase the IDS's potency.
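To illustrate the competitive-learning paradigm that distinguishes SOM-based models from error-based learning, the sketch below implements a minimal SOM training loop in NumPy. This is an illustrative sketch only, not the architecture described in the paper; the grid size, learning-rate schedule, and Gaussian neighborhood function are assumed defaults.

```python
import numpy as np

def train_som(data, grid_shape=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Minimal Self-Organizing Map trained with competitive (winner-take-all) updates.

    data: array of shape (n_samples, n_features), assumed to be scaled to [0, 1].
    Returns the trained weight grid of shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    n_features = data.shape[1]
    # One weight vector per neuron on the 2-D grid
    weights = rng.random((rows, cols, n_features))
    # Grid coordinates used to measure neighborhood distance between neurons
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            # Decay the learning rate and neighborhood radius over training
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3

            # Competition: find the Best Matching Unit (BMU) for this input
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)

            # Cooperation: Gaussian neighborhood around the BMU on the grid
            grid_dist = np.linalg.norm(grid - np.array(bmu), axis=-1)
            h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))

            # Adaptation: pull the BMU and its neighbors toward the input
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights
```

Because each neuron's weight vector lives in the same feature space as the input records, the trained grid can be inspected directly (for example, per-feature heatmaps or per-neuron label counts), which is the property that makes this family of models amenable to the statistical and visual explanations discussed in the paper.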