Recent developments in explainable AI (XAI) allow researchers to explore the inner workings of deep neural networks (DNNs), revealing crucial information about input-output relationships and how data connect with machine learning models. In this paper we explore the interpretability of DNN models designed to identify jets originating from top quark decays in high energy proton-proton collisions at the Large Hadron Collider (LHC). We review a subset of existing top tagger models and explore different quantitative methods to determine which features play the most important roles in tagging top jets. We also investigate how and why feature importance varies across different XAI metrics, how correlations among features affect their explainability, and how latent space representations encode information and correlate with physically meaningful quantities. Our studies uncover some major pitfalls of existing XAI methods and illustrate how they can be overcome to obtain consistent and meaningful interpretations of these models. We additionally visualize the activity of hidden layers as Neural Activation Pattern (NAP) diagrams and demonstrate how they can be used to understand how DNNs relay information across their layers, and how this understanding can make such models significantly simpler by enabling effective model reoptimization and hyperparameter tuning. These studies not only facilitate a methodological approach to interpreting models but also unveil new insights about what these models learn. Incorporating these observations into augmented model design, we propose the Particle Flow Interaction Network (PFIN) model and demonstrate how interpretability-inspired model augmentation can improve top tagging performance.