Understanding predictions made by deep neural networks is notoriously difficult, but also crucial to their dissemination. Like all machine learning based methods, they are only as good as their training data, and can also capture unwanted biases. While there are tools that can help understand whether such biases exist, they do not distinguish between correlation and causation, and might be ill-suited for text-based models and for reasoning about high-level language concepts. A key problem in estimating the causal effect of a concept of interest on a given model is that this estimation requires generating counterfactual examples, which is challenging with existing generation technology. To bridge that gap, we propose CausaLM, a framework for producing causal model explanations using counterfactual language representation models. Our approach is based on fine-tuning deep contextualized embedding models with auxiliary adversarial tasks derived from the causal graph of the problem. Concretely, we show that by carefully choosing auxiliary adversarial pre-training tasks, language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest and be used to estimate its true causal effect on model performance. A byproduct of our method is a language representation model that is unaffected by the tested concept, which can be useful in mitigating unwanted bias ingrained in the data.
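To make the adversarial idea concrete, the following is a minimal, hypothetical sketch rather than the authors' implementation: a BERT encoder is trained with a task head plus an adversarial concept head placed behind a gradient-reversal layer, so the learned representation becomes uninformative about the treated concept; the concept's effect is then approximated as the average difference between the predictions of the original model and of the model built on this counterfactual representation. The class names, the data-loader format, and the use of gradient reversal during fine-tuning (the paper describes auxiliary adversarial pre-training tasks derived from the causal graph) are all simplifying assumptions made here for illustration.

```python
# Hypothetical sketch of adversarial counterfactual representation learning
# (simplified; not the paper's code).
import torch
import torch.nn as nn
from transformers import BertModel


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class CounterfactualBert(nn.Module):
    """BERT encoder with a task head and an adversarial concept head.
    The reversed gradient pushes the encoder to discard the treated concept.
    Training loss (assumed): CE(task_logits, y_task) + CE(adv_logits, y_concept)."""
    def __init__(self, num_task_labels=2, num_concept_labels=2, lambd=1.0):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.bert.config.hidden_size
        self.task_head = nn.Linear(hidden, num_task_labels)
        self.concept_head = nn.Linear(hidden, num_concept_labels)
        self.lambd = lambd

    def forward(self, input_ids, attention_mask):
        cls = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).pooler_output
        task_logits = self.task_head(cls)
        adv_logits = self.concept_head(GradReverse.apply(cls, self.lambd))
        return task_logits, adv_logits


def estimate_effect(orig_model, cf_model, loader):
    """Average absolute difference between the original and counterfactual
    models' class probabilities; loader yields (input_ids, attention_mask)."""
    diffs = []
    with torch.no_grad():
        for input_ids, attention_mask in loader:
            p_orig = torch.softmax(orig_model(input_ids, attention_mask)[0], dim=-1)
            p_cf = torch.softmax(cf_model(input_ids, attention_mask)[0], dim=-1)
            diffs.append((p_orig - p_cf).abs().sum(dim=-1))
    return torch.cat(diffs).mean().item()
```

The sketch folds a single adversarial objective into task fine-tuning for brevity; per the abstract, the actual framework introduces the adversarial objectives as auxiliary pre-training tasks chosen from the causal graph of the problem.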