Improving Certified Robustness via Statistical Learning with Logical Reasoning

Intensive algorithmic efforts have recently been made to enable rapid improvements in certified robustness for complex ML models. However, current robustness certification methods are only able to certify under a limited perturbation radius. Given that existing pure data-driven statistical approaches have reached a bottleneck, in this paper we propose to integrate statistical ML models with knowledge (expressed as logical rules) as a reasoning component using Markov logic networks (MLN), so as to further improve the overall certified robustness. This opens new research questions about certifying the robustness of such a paradigm, especially the reasoning component (e.g., MLN). As a first step towards understanding these questions, we prove that the computational complexity of certifying the robustness of MLN is #P-hard. Guided by this hardness result, we then derive the first certified robustness bound for MLN by carefully analyzing different model regimes. Finally, we conduct extensive experiments on five datasets including both high-dimensional images and natural language texts, and we show that the certified robustness with knowledge-based logical reasoning indeed significantly outperforms that of the state of the art.
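To make the idea of a reasoning component concrete, the following is a minimal toy sketch, not the paper's certification method. It assumes two binary variables (a hypothetical main classifier output `stop_sign` and a hypothetical auxiliary shape sensor output `octagon`), encodes each sensor's confidence as a log-odds weight (an assumption made here for illustration), and adds one weighted rule `stop_sign => octagon`. The marginal is computed by enumerating all worlds, which is exponential in the number of variables and therefore only feasible for toy models. The function names and the grid-search check are likewise illustrative assumptions.

```python
import itertools
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def mln_marginal(p_main, p_shape, rule_weight):
    """P(stop_sign = 1) in a toy two-variable MLN, by enumerating all worlds.

    Assumption: sensor confidences enter as unary log-odds weights, and the
    single rule `stop_sign => octagon` carries weight `rule_weight`.
    """
    numerator = 0.0
    partition = 0.0
    for stop_sign, octagon in itertools.product([0, 1], repeat=2):
        score = logit(p_main) * stop_sign + logit(p_shape) * octagon
        if (not stop_sign) or octagon:  # the implication is satisfied
            score += rule_weight
        weight = math.exp(score)
        partition += weight
        if stop_sign:
            numerator += weight
    return numerator / partition


def worst_case_marginal(p_main, p_shape, eps, rule_weight, steps=21):
    """Smallest marginal found on a grid when each confidence moves by at most eps.

    This is a brute-force check over a grid, not a certified bound.
    """
    def grid(p):
        lo = max(p - eps, 1e-6)
        hi = min(p + eps, 1.0 - 1e-6)
        return [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]

    return min(
        mln_marginal(a, b, rule_weight)
        for a in grid(p_main)
        for b in grid(p_shape)
    )


if __name__ == "__main__":
    print(mln_marginal(0.7, 0.9, 2.0))              # sensors agree
    print(mln_marginal(0.7, 0.1, 2.0))              # shape sensor disagrees
    print(worst_case_marginal(0.7, 0.9, 0.1, 2.0))  # perturbed confidences
```

In this toy model, a rule weight of zero leaves the marginal equal to the main classifier's confidence, while a positive rule weight lowers the belief in `stop_sign` when the shape sensor disagrees. Certifying a reasoning component amounts to bounding how far such a marginal can move when the sensor confidences are perturbed; the grid search above only probes that range empirically.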
