The aim of this paper is to offer the first systematic exploration and definition of equivalent causal models in the setting where the two models are not made up of the same variables. The idea is that two models are equivalent when they agree on all "essential" causal information that can be expressed using their common variables. I do so by focussing on the two main features of causal models, namely their structural relations and their functional relations. In particular, I define several relations of causal ancestry and several relations of causal sufficiency, and require that the most general of these relations be preserved across equivalent models.
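
As a rough illustration of the idea of agreement on common variables, the sketch below compares two small causal graphs that do not share all of their variables and checks whether they induce the same ancestry relation on the variables they do share. This is a deliberately simplified stand-in, not the paper's definition: it checks only plain graph ancestry, ignores functional relations and causal sufficiency, and the variable names and the helper `agree_on_common_ancestry` are hypothetical.

```python
def ancestors(parents, node):
    """Return all strict ancestors of `node` in a DAG given as {variable: set of parents}."""
    seen = set()
    stack = list(parents[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def ancestry_on(parents, variables):
    """Ancestry pairs (ancestor, descendant) restricted to the given variables."""
    return {
        (a, v)
        for v in variables
        for a in ancestors(parents, v)
        if a in variables
    }


def agree_on_common_ancestry(parents_1, parents_2):
    """True if both models induce the same ancestry relation on their shared variables."""
    common = set(parents_1) & set(parents_2)
    return ancestry_on(parents_1, common) == ancestry_on(parents_2, common)


# Hypothetical example models (not taken from the paper).
fine = {"Smoking": set(), "Tar": {"Smoking"}, "Cancer": {"Tar"}}   # includes a mediator
coarse = {"Smoking": set(), "Cancer": {"Smoking"}}                 # mediator left out
reversed_model = {"Cancer": set(), "Smoking": {"Cancer"}}          # opposite direction

print(agree_on_common_ancestry(fine, coarse))          # the mediator is not a shared variable
print(agree_on_common_ancestry(fine, reversed_model))  # direction differs on shared variables
```

Here `fine` and `coarse` agree because the only shared variables are `Smoking` and `Cancer`, and in both models the former is an ancestor of the latter; the reversed model disagrees.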

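Related content

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM-SIGSOFT and IEEE-TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Attendees of MODELS come from diverse backgrounds, including researchers, academics, engineers, and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website: http://www.modelsconference.org/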

Using deep latent variable models in causal inference has attracted considerable interest recently, but an essential open question is their identifiability. While they have yielded promising results and theory exists on the identifiability of some simple model formulations, we also know that causal effects cannot be identified in general with latent variables. We investigate this gap between theory and empirical results with theoretical considerations and extensive experiments on multiple synthetic and real-world data sets, using the causal effect variational autoencoder (CEVAE) as a case study. While CEVAE seems to work reliably under some simple scenarios, it does not identify the correct causal effect with a misspecified latent variable or a complex data distribution, contrary to the original goals of the model. Our results show that the question of identifiability cannot be disregarded, and we argue that more attention should be paid to it in future work.

Latent variable models (LVMs) are probabilistic models where some of the variables are hidden during training. A broad class of LVMs have a directed acyclic graphical structure. The directed structure suggests an intuitive causal explanation of the data generating process. For example, a latent topic model suggests that topics cause the occurrence of a token. Despite this intuitive causal interpretation, a directed acyclic latent variable model trained on data is generally insufficient for causal reasoning, as the required model parameters may not be uniquely identified. In this manuscript we demonstrate that an LVM can answer any causal query posed post-training, provided that the query can be identified from the observed variables according to the do-calculus rules. We show that causal reasoning can enhance a broad class of LVMs long established in the probabilistic modeling community, and demonstrate its effectiveness on several case studies. These include a machine learning model with multiple causes where there exists a set of latent confounders and a mediator between the causes and the outcome variable, a study where the identifiable causal query cannot be estimated using the front-door or back-door criterion, a case study that captures unobserved crosstalk between two biological signaling pathways, and a COVID-19 expert system that identifies multiple causal queries.
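
For reference, the two criteria named in the abstract above correspond to the following standard adjustment formulas. The symbols are assumptions introduced here for illustration: $X$ is the treatment, $Y$ the outcome, $Z$ a set of observed variables satisfying the back-door criterion relative to $(X, Y)$, and $M$ a set of observed mediators satisfying the front-door criterion. These are textbook identities, not results specific to the paper.

```latex
% Back-door adjustment: Z blocks every back-door path from X to Y
% and contains no descendant of X.
P(y \mid do(x)) = \sum_{z} P(y \mid x, z)\, P(z)

% Front-door adjustment: M intercepts all directed paths from X to Y.
P(y \mid do(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid x', m)\, P(x')
```

Both formulas express an interventional quantity using only observational distributions. The abstract's point is that some queries are identifiable by do-calculus even when neither formula applies.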

The Univalence Principle is the statement that equivalent mathematical structures are indistinguishable. We prove a general version of this principle that applies to all set-based, categorical, and higher-categorical structures defined in a non-algebraic and space-based style, as well as models of higher-order theories such as topological spaces. In particular, we formulate a general definition of indiscernibility for objects of any such structure, and a corresponding univalence condition that generalizes Rezk's completeness condition for Segal spaces and ensures that all equivalences of structures are levelwise equivalences. Our work builds on Makkai's First-Order Logic with Dependent Sorts, but is expressed in Voevodsky's Univalent Foundations (UF), extending previous work on the Structure Identity Principle and univalent categories in UF. This enables indistinguishability to be expressed simply as identification, and yields a formal theory that is interpretable in classical homotopy theory, but also in other higher topos models. It follows that Univalent Foundations is a fully equivalence-invariant foundation for higher-categorical mathematics, as intended by Voevodsky.

This paper aims to help practitioners of causal mediation analysis gain a better understanding of estimation options. We take as inputs two familiar strategies (weighting and model-based prediction) and a simple way of combining them (weighted models), and show how we can generate a range of estimators with different modeling requirements and robustness properties. The primary goal is to help build an intuitive appreciation for robust estimation that is conducive to sound practice and does not require advanced statistical knowledge. A second goal is to provide a "menu" of estimators that practitioners can choose from for the estimation of marginal natural (in)direct effects. The estimators generated from this exercise include some that coincide with or are similar to existing estimators and others that have not appeared in the literature. We use a random continuous weights bootstrap to obtain confidence intervals, and also derive general asymptotic (sandwich) variance formulas for the estimators. The estimators are illustrated using data from an adolescent alcohol use prevention study.

Many methods now exist for conditioning model outputs on task instructions, retrieved documents, and user-provided explanations and feedback. Rather than relying solely on examples of task inputs and outputs, these approaches use valuable additional data for improving model correctness and aligning learned models with human priors. Meanwhile, a growing body of evidence suggests that some language models can (1) store a large amount of knowledge in their parameters, and (2) perform inference over tasks in textual inputs at test time. These results raise the possibility that, for some tasks, humans cannot explain to a model any more about the task than it already knows or could infer on its own. In this paper, we study the circumstances under which explanations of individual data points can (or cannot) improve modeling performance. In order to carefully control important properties of the data and explanations, we introduce a synthetic dataset for experiments, and we also make use of three existing datasets with explanations: e-SNLI, TACRED, and SemEval. We first give a formal framework for the available modeling approaches, in which explanation data can be used as model inputs, as targets, or as a prior. After arguing that the most promising role for explanation data is as model inputs, we propose to use a retrieval-based method and show that it solves our synthetic task with accuracies upwards of 95%, while baselines without explanation data achieve below 65% accuracy. We then identify properties of datasets for which retrieval-based modeling fails. With the three existing datasets, we find no improvements from explanation retrieval. Drawing on findings from our synthetic task, we suggest that at least one of six preconditions for successful modeling fails to hold with these datasets. Our code is publicly available at https://github.com/peterbhase/ExplanationRoles

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence the estimation error can be very large. We therefore propose another approach to constructing the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical estimators and the proposed estimators in estimating the causal quantities. The comparison is carried out across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is substantial if the causal effects are accounted for correctly.
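
To make the role of confounding concrete, the sketch below computes, exactly rather than by simulation, how far a naive difference in means can drift from the true effect when a confounder drives both the credit decision and repayment. It uses standard back-door adjustment over an observed confounder, not the estimators proposed in the paper, and every number and name in it is a hypothetical toy value.

```python
# Hypothetical toy distribution, chosen only for illustration.
P_Z = {0: 0.5, 1: 0.5}            # P(Z = z): borrower creditworthiness (the confounder)
P_T1_GIVEN_Z = {0: 0.2, 1: 0.8}   # P(T = 1 | Z = z): lender grants the larger credit line


def expected_repayment(t, z):
    """E[Y | T = t, Z = z]; by construction the true effect of T is +0.1."""
    return 0.5 + 0.1 * t + 0.3 * z


def p_t_given_z(t, z):
    """P(T = t | Z = z) for binary T."""
    return P_T1_GIVEN_Z[z] if t == 1 else 1.0 - P_T1_GIVEN_Z[z]


def naive_difference():
    """E[Y | T = 1] - E[Y | T = 0], ignoring the confounder Z."""
    means = {}
    for t in (0, 1):
        p_t = sum(P_Z[z] * p_t_given_z(t, z) for z in P_Z)
        weighted = sum(
            expected_repayment(t, z) * P_Z[z] * p_t_given_z(t, z) for z in P_Z
        )
        means[t] = weighted / p_t
    return means[1] - means[0]


def adjusted_difference():
    """Back-door adjustment: average the within-stratum effect over P(Z)."""
    return sum(
        P_Z[z] * (expected_repayment(1, z) - expected_repayment(0, z)) for z in P_Z
    )


print(round(naive_difference(), 6))     # confounded comparison
print(round(adjusted_difference(), 6))  # recovers the effect built into the toy model
```

In this toy setup the naive comparison reports 0.28 while the adjusted one recovers the constructed effect of 0.1, because creditworthy borrowers are both more likely to receive the larger credit line and more likely to repay.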

Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and the low budget requirement compared with randomized controlled trials. With the rapid development of machine learning, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether or not they require all three assumptions of the potential outcome framework. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which helps researchers and practitioners explore, evaluate, and apply the causal inference methods.

Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fill-in-the-blank" cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA.

In this paper, we propose Latent Relation Language Models (LRLMs), a class of language models that parameterizes the joint distribution over the words in a document and the entities that occur therein via knowledge graph relations. This model has a number of attractive properties: it not only improves language modeling performance, but is also able to annotate the posterior probability of entity spans for a given text through relations. Experiments demonstrate empirical improvements over both a word-based baseline language model and a previous approach that incorporates knowledge graph information. Qualitative analysis further demonstrates the proposed model's ability to learn to predict appropriate relations in context.

A fundamental computation for statistical inference and accurate decision-making is to compute the marginal probabilities or most probable states of task-relevant variables. Probabilistic graphical models can efficiently represent the structure of such complex data, but performing these inferences is generally difficult. Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops. Here we use Graph Neural Networks (GNNs) to learn a message-passing algorithm that solves these inference tasks. We first show that the architecture of GNNs is well-matched to inference tasks. We then demonstrate the efficacy of this inference approach by training GNNs on a collection of graphical models and showing that they substantially outperform belief propagation on loopy graphs. Our message-passing algorithms generalize out of the training set to larger graphs and graphs with different structure.
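
As background for the message-passing baseline mentioned above, the following sketch runs plain sum-product belief propagation on a three-node chain of binary variables and checks it against brute-force enumeration. It is not the learned GNN procedure from the paper: the chain, the potentials, and all function names are hypothetical, and the graph is deliberately loop-free, which is the setting where belief propagation is exact.

```python
from itertools import product

# Hypothetical pairwise model on a chain x0 - x1 - x2 of binary variables.
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]  # unary[i][s]: potential of node i in state s
pair = [[2.0, 1.0], [1.0, 2.0]]               # pair[a][b]: edge potential, favors agreement


def chain_marginals(unary, pair):
    """Sum-product on a chain: one left-to-right pass and one right-to-left pass."""
    n = len(unary)
    states = range(2)
    fwd = [[1.0, 1.0] for _ in range(n)]  # fwd[i][s]: message reaching node i from the left
    bwd = [[1.0, 1.0] for _ in range(n)]  # bwd[i][s]: message reaching node i from the right
    for i in range(1, n):
        fwd[i] = [
            sum(fwd[i - 1][a] * unary[i - 1][a] * pair[a][b] for a in states)
            for b in states
        ]
    for i in range(n - 2, -1, -1):
        bwd[i] = [
            sum(bwd[i + 1][b] * unary[i + 1][b] * pair[a][b] for b in states)
            for a in states
        ]
    marginals = []
    for i in range(n):
        belief = [fwd[i][s] * unary[i][s] * bwd[i][s] for s in states]
        total = sum(belief)
        marginals.append([b / total for b in belief])
    return marginals


def brute_force_marginals(unary, pair):
    """Exact marginals by enumerating every joint assignment."""
    n = len(unary)
    marg = [[0.0, 0.0] for _ in range(n)]
    for x in product(range(2), repeat=n):
        weight = 1.0
        for i in range(n):
            weight *= unary[i][x[i]]
        for i in range(n - 1):
            weight *= pair[x[i]][x[i + 1]]
        for i in range(n):
            marg[i][x[i]] += weight
    z = sum(marg[0])  # partition function
    return [[v / z for v in m] for m in marg]


bp = chain_marginals(unary, pair)
exact = brute_force_marginals(unary, pair)
print(all(abs(p - q) < 1e-9 for m, e in zip(bp, exact) for p, q in zip(m, e)))
```

On graphs with loops the same local updates no longer have this exactness guarantee, which is the regime the abstract targets with learned message passing.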
