Intensive algorithmic efforts have recently been made to enable rapid improvements in the certified robustness of complex ML models. However, current robustness certification methods can only certify within a limited perturbation radius. Given that existing purely data-driven statistical approaches have reached a bottleneck, in this paper we propose to integrate statistical ML models with knowledge (expressed as logical rules) as a reasoning component using Markov logic networks (MLN), so as to further improve the overall certified robustness. This opens new research questions about certifying the robustness of such a paradigm, especially its reasoning component (e.g., the MLN). As a first step towards understanding these questions, we prove that the computational complexity of certifying the robustness of an MLN is #P-hard. Guided by this hardness result, we then derive the first certified robustness bound for MLN by carefully analyzing different model regimes. Finally, we conduct extensive experiments on five datasets, covering both high-dimensional images and natural language text, and show that certified robustness with knowledge-based logical reasoning indeed significantly outperforms the state of the art.
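For context, a brief sketch of the standard Markov logic network formulation (the textbook definition due to Richardson and Domingos; the exact variant analyzed in this paper may differ) may help: an MLN attaches a weight $w_i$ to each first-order logical formula $F_i$ and defines the probability of a possible world $x$ as
\[
P(X = x) = \frac{1}{Z} \exp\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_{i} w_i \, n_i(x') \Big),
\]
where $n_i(x)$ counts the true groundings of $F_i$ in $x$ and $Z$ is the partition function over all possible worlds. Intuitively, certifying the robustness of this reasoning component requires bounding how the induced predictions can change when the MLN's inputs (e.g., the statistical model's outputs) are perturbed within a given radius.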