Propensity score methods have been shown to be powerful for obtaining efficient estimators of the average treatment effect (ATE) from observational data, especially in the presence of confounding factors. When estimating the propensity score, deciding which types of covariates to include in the propensity score function is important, since incorporating unnecessary covariates may amplify both the bias and the variance of ATE estimators. In this paper, we show that including additional instrumental variables that satisfy the exclusion restriction for the outcome harms statistical efficiency. We also prove that controlling for covariates that are pure outcome predictors, i.e., covariates that predict the outcome but are unrelated to the exposure, helps reduce the asymptotic variance of ATE estimation. We further note that efficient non-parametric or semi-parametric estimation of the ATE requires an estimated propensity score function, as described in Hirano et al. (2003)\cite{Hirano2003}. Such estimation procedures usually demand many regularity conditions; Rothe (2016)\cite{Rothe2016} illustrated this point and proposed a known propensity score (KPS) estimator that requires only mild regularity conditions and is still fully efficient. In addition, we introduce a linearly modified (LM) estimator that is nearly efficient in most general settings and does not require estimation of the propensity score function, and is hence convenient to compute. The construction of this estimator borrows ideas from the interaction estimator of Lin (2013)\cite{Lin2013}, in which regression adjustment with interaction terms is applied to data arising from a completely randomized experiment. As its name suggests, the LM estimator can be viewed as a linear modification of the IPW estimator that uses known propensity scores. We also investigate its statistical properties both analytically and numerically.
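
To make the estimators mentioned above concrete, the sketch below shows an IPW estimator with known propensity scores together with a Lin (2013)-style interaction-adjusted estimator. The function names and the use of plain least squares are illustrative assumptions of this summary; the LM estimator itself is defined precisely in the paper.

```python
import numpy as np

def ipw_ate(y, t, e):
    """IPW (Horvitz-Thompson) estimate of the ATE with known propensity scores e."""
    return np.mean(t * y / e - (1 - t) * y / (1 - e))

def lin_interaction_ate(y, t, x):
    """Lin (2013)-style adjustment: regress y on treatment, centered covariates,
    and their interactions; the coefficient on treatment estimates the ATE."""
    xc = x - x.mean(axis=0)                      # center covariates
    design = np.column_stack([np.ones(len(y)), t, xc, t[:, None] * xc])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]
```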

Related content

The IFIP TC13 Conference on Human-Computer Interaction is an important venue for researchers and practitioners in the field of human-computer interaction to present their work. Over the years, these conferences have attracted researchers from several countries and cultures. Official website: http://interact2019.org/

Reinforcement learning (RL) experiments have notoriously high variance, and minor details can have disproportionately large effects on measured outcomes. This is problematic for creating reproducible research and also serves as an obstacle for real-world applications, where safety and predictability are paramount. In this paper, we investigate causes for this perceived instability. To allow for an in-depth analysis, we focus on an especially popular setup with high variance -- continuous control from pixels with an actor-critic agent. In this setting, we demonstrate that variance mostly arises early in training as a result of poor "outlier" runs, but that weight initialization and initial exploration are not to blame. We show that one cause for early variance is numerical instability which leads to saturating nonlinearities. We investigate several fixes to this issue and find that one particular method is surprisingly effective and simple -- normalizing penultimate features. Addressing the learning instability allows for larger learning rates, and significantly decreases the variance of outcomes. This demonstrates that the perceived variance in RL is not necessarily inherent to the problem definition and may be addressed through simple architectural modifications.
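
As an illustration of the fix the abstract highlights, the sketch below L2-normalizes the penultimate features of a critic/actor head before the output layer. The module name and architecture are hypothetical, and whether the paper uses exactly this normalization scheme is not assumed here.

```python
import torch.nn as nn
import torch.nn.functional as F

class NormalizedHead(nn.Module):
    """Head that projects penultimate features onto the unit sphere, keeping
    pre-activations bounded so downstream nonlinearities cannot saturate."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.out = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        h = self.trunk(x)
        h = F.normalize(h, dim=-1)   # unit-norm penultimate features
        return self.out(h)
```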

Hierarchical inference in (generalized) regression problems is powerful for finding significant groups or even single covariates, especially in high-dimensional settings where identifiability of the entire regression parameter vector may be ill-posed. The general method proceeds in a fully data-driven and adaptive way from large to small groups or singletons of covariates, depending on the signal strength and the correlation structure of the design matrix. We propose a novel hierarchical multiple testing adjustment that can be used in combination with any significance test for a group of covariates to perform hierarchical inference. Our adjustment passes on the significance level of certain hypotheses that could not be rejected and is shown to guarantee strong control of the familywise error rate. Our method is at least as powerful as a so-called depth-wise hierarchical Bonferroni adjustment. It provides a substantial gain in power over other previously proposed inheritance hierarchical procedures if the underlying alternative hypotheses occur sparsely along a few branches in the tree-structured hierarchy.
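
To fix ideas, the sketch below implements a plain depth-wise hierarchical Bonferroni procedure of the kind the abstract uses as a benchmark. The tree representation and the `test_group` interface are assumptions of this summary; the paper's inheritance-style adjustment, which passes on the level of non-rejected hypotheses, is not reproduced here.

```python
def hierarchical_test(node, test_group, alpha):
    """Depth-wise hierarchical Bonferroni sketch: test a group of covariates and,
    only if it is rejected, recurse into its children with the level split evenly.
    `node` is a dict {'group': [covariate indices], 'children': [subnodes]};
    `test_group(group, alpha)` returns a p-value for H0: no signal in the group."""
    rejections = []
    if test_group(node["group"], alpha) <= alpha:
        rejections.append(tuple(node["group"]))
        children = node.get("children", [])
        for child in children:
            rejections += hierarchical_test(child, test_group, alpha / len(children))
    return rejections
```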

Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible mis-specification of the likelihood. Here we consider generalised Bayesian inference with a Stein discrepancy as a loss function, motivated by applications in which the likelihood contains an intractable normalisation constant. In this context, the Stein discrepancy circumvents evaluation of the normalisation constant and produces generalised posteriors that are either closed form or accessible using standard Markov chain Monte Carlo. On a theoretical level, we show consistency, asymptotic normality, and bias-robustness of the generalised posterior, highlighting how these properties are impacted by the choice of Stein discrepancy. Then, we provide numerical experiments on a range of intractable distributions, including applications to kernel-based exponential family models and non-Gaussian graphical models.
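
For orientation (the notation below is an assumption of this summary, not taken verbatim from the paper), a Stein-discrepancy generalised posterior has the form

```latex
\pi_\beta(\theta \mid x_{1:n}) \;\propto\; \pi(\theta)\,
  \exp\!\left\{ -\beta \, n \, \mathrm{SD}^2\!\left( p_\theta, \hat{p}_n \right) \right\},
```

where \pi is the prior, \hat{p}_n the empirical distribution of the data, \beta > 0 a learning rate, and \mathrm{SD} a (kernel) Stein discrepancy. Because the Stein discrepancy depends on p_\theta only through the score \nabla_x \log p_\theta(x), the intractable normalisation constant never needs to be evaluated.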

Cluster randomized trials (CRTs) randomly assign an intervention to groups of individuals (e.g., clinics or communities) and measure outcomes on individuals in those groups. While offering many advantages, this experimental design introduces challenges that are only partially addressed by existing analytic approaches. First, outcomes are often missing for some individuals within clusters. Failing to appropriately adjust for differential outcome measurement can result in biased estimates and inference. Second, CRTs often randomize limited numbers of clusters, resulting in chance imbalances on baseline outcome predictors between arms. Failing to adaptively adjust for these imbalances and other predictive covariates can result in efficiency losses. To address these methodological gaps, we propose and evaluate a novel two-stage targeted minimum loss-based estimator (TMLE) to adjust for baseline covariates in a manner that optimizes precision, after controlling for baseline and post-baseline causes of missing outcomes. Finite sample simulations illustrate that our approach can nearly eliminate bias due to differential outcome measurement, while existing CRT estimators yield misleading results and inferences. Application to real data from the SEARCH community randomized trial demonstrates the gains in efficiency afforded through adaptive adjustment for baseline covariates, after controlling for missingness on individual-level outcomes.
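
The rough sketch below conveys only the two-stage shape of such an analysis, not the paper's TMLE: stage 1 reweights measured outcomes by an estimated inverse probability of measurement, and stage 2 contrasts cluster-level summaries between arms. All column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def two_stage_crt_contrast(df, covs):
    """df: one row per individual with hypothetical columns 'cluster', 'arm' (0/1),
    'measured' (0/1), and 'y' (outcome, NaN when not measured); `covs` lists
    baseline covariate columns."""
    # Stage 1: estimate the probability that an individual's outcome is measured.
    fit = LogisticRegression(max_iter=1000).fit(df[covs], df["measured"])
    df = df.assign(w=1.0 / fit.predict_proba(df[covs])[:, 1])
    # Inverse-probability-of-measurement-weighted mean outcome per cluster.
    measured = df[df["measured"] == 1]
    cl = (measured.groupby(["cluster", "arm"])
                  .apply(lambda g: np.average(g["y"], weights=g["w"]))
                  .rename("ymean").reset_index())
    # Stage 2: unadjusted contrast of cluster-level summaries between arms.
    return cl.loc[cl["arm"] == 1, "ymean"].mean() - cl.loc[cl["arm"] == 0, "ymean"].mean()
```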

Fine stratification is a popular design as it permits stratification to be carried out to the fullest possible extent. Examples include the Current Population Survey and the National Crime Victimization Survey, both conducted by the U.S. Census Bureau, and the National Survey of Family Growth, conducted by the University of Michigan's Institute for Social Research. The fine stratification design has proved useful in many applications, as its point estimator is unbiased and efficient. A common practice for estimating the variance in this context is to collapse adjacent strata into pseudo-strata and then estimate the variance, but the resulting variance estimator is not design-unbiased, and the bias increases as the population means of the pseudo-strata become more variable. Additionally, the estimator may suffer from a large mean squared error (MSE). In this paper, we propose a hierarchical Bayesian estimator for the variance of collapsed strata and compare the results with a nonparametric Bayes variance estimator. We also make comparisons with a kernel-based variance estimator recently proposed by Breidt et al. (2016). We show that our proposed estimator is superior to the alternatives in the literature, in that it has smaller frequentist MSE and bias. We verify this through multiple simulation studies and analyses of data from the 2007-2008 National Health and Nutrition Examination Survey and the 1998 Survey of Mental Health Organizations.
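
For reference, the sketch below computes the classical collapsed-strata variance estimator whose design bias motivates the proposal. It assumes one sampled unit per stratum, an even number of strata, and adjacent pairing; none of these choices are taken from the paper itself.

```python
import numpy as np

def collapsed_strata_variance(stratum_totals):
    """Classical collapsed-strata variance estimate for an estimated total under
    fine stratification (one unit per stratum): pair adjacent strata into
    pseudo-strata and sum the squared differences of estimated stratum totals."""
    t = np.asarray(stratum_totals, dtype=float)
    if len(t) % 2:
        raise ValueError("this sketch assumes an even number of strata")
    pairs = t.reshape(-1, 2)
    return float(np.sum((pairs[:, 0] - pairs[:, 1]) ** 2))
```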

In observational studies, causal inference relies on several key identifying assumptions. One identifiability condition is the positivity assumption, which requires that the probability of treatment be bounded away from 0 and 1. That is, for every covariate combination, it should be possible to observe both treated and control subjects, i.e., the covariate distributions should overlap between treatment arms. If the positivity assumption is violated, population-level causal inference necessarily involves some extrapolation. Ideally, causal effect estimates should reflect greater uncertainty in such situations. With that goal in mind, we construct a Gaussian process model for estimating treatment effects in the presence of practical violations of positivity. Advantages of our method include minimal distributional assumptions, a cohesive model for estimating treatment effects, and greater uncertainty in areas of the covariate space where there is less overlap. We assess the performance of our approach with respect to bias and efficiency using simulation studies. The method is then applied to a study of critically ill female patients to examine the effect of undergoing right heart catheterization.
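
As a simplified illustration of the idea (not the paper's cohesive single model), the sketch below fits one Gaussian process per treatment arm with scikit-learn, imputes both potential outcomes for every unit, and averages their difference; the posterior standard deviations grow in covariate regions an arm does not cover, which is exactly where extrapolation happens.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def gp_ate(x, t, y):
    """Fit separate GPs to treated and control outcomes, predict both potential
    outcomes for every unit, and average the difference."""
    kernel = 1.0 * RBF() + WhiteKernel()
    gp1 = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(x[t == 1], y[t == 1])
    gp0 = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(x[t == 0], y[t == 0])
    mu1, sd1 = gp1.predict(x, return_std=True)
    mu0, sd0 = gp0.predict(x, return_std=True)
    ate = np.mean(mu1 - mu0)
    # Crude uncertainty that ignores posterior covariance across units.
    ate_sd = np.sqrt(np.sum(sd1 ** 2 + sd0 ** 2)) / len(y)
    return ate, ate_sd
```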

Randomization tests are based on a re-randomization of existing data to obtain data-dependent critical values that lead to exact hypothesis tests under special circumstances. However, it is not always possible to re-randomize data in accordance with the physical randomization from which the data have been obtained. As a consequence, most statistical tests cannot control the type I error probability exactly. Still, similarly to the bootstrap, data re-randomization can be used to improve type I error control. However, no general asymptotic theory under weak null hypotheses has yet been developed for such randomization tests. The aim of this paper is to provide a conveniently applicable theory on the asymptotic validity of randomization tests with asymptotically normal test statistics. Confidence intervals are developed analogously. This is achieved by creating a link between two well-established fields in mathematical statistics: empirical processes and inference based on randomization via algebraic groups. A broadly applicable conditional weak convergence theorem is developed for empirical processes that are based on randomized observations. Random elements of an algebraic group are applied to the data vectors, from which the randomized version of a statistic is derived. Combining a variant of the functional delta-method with a suitable studentization of the statistic, asymptotically exact hypothesis tests are deduced, while the finite-sample exactness property under group-invariant sub-hypotheses is preserved. The methodology is exemplified with: the Pearson correlation coefficient, a Mann-Whitney effect based on right-censored paired data, and a competing risks analysis. The practical usefulness of the approaches is assessed through simulation studies and an application to data from patients suffering from diabetic retinopathy.
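
The generic recipe can be sketched in a few lines: apply random group elements to the data and recompute a studentized statistic. The sign-flip group and the function names below are chosen only as an example and are not taken from the paper.

```python
import numpy as np

def randomization_pvalue(x, stat, group_action, n_draws=999, seed=0):
    """Two-sided randomization p-value: compare the observed studentized statistic
    with its distribution over random group elements applied to the data."""
    rng = np.random.default_rng(seed)
    observed = stat(x)
    draws = np.array([stat(group_action(x, rng)) for _ in range(n_draws)])
    return (1 + np.sum(np.abs(draws) >= np.abs(observed))) / (n_draws + 1)

# Example: H0: E[X] = 0 via the sign-flip group and a studentized mean.
t_stat = lambda x: np.sqrt(len(x)) * x.mean() / x.std(ddof=1)
sign_flip = lambda x, rng: x * rng.choice([-1.0, 1.0], size=len(x))
# p = randomization_pvalue(np.random.default_rng(1).normal(size=30), t_stat, sign_flip)
```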

Monte Carlo estimation plays a crucial role in the analysis of stochastic reaction networks. However, reducing the statistical uncertainty of the corresponding estimators requires sampling a large number of trajectories. We propose control variates based on the statistical moments of the process to reduce the estimators' variances. We develop an algorithm that selects an efficient subset of infinitely many control variates. To this end, the algorithm uses resampling and a redundancy-aware greedy selection. We demonstrate the efficiency of our approach in several case studies.
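
The variance-reduction step itself is standard and is sketched below with generic control variates of known (zero) expectation. How the moment-based control variates are constructed and how the subset is selected are specific to the paper and not shown here.

```python
import numpy as np

def control_variate_mean(x, c):
    """Control-variate estimate of E[x]: x has shape (n,); c has shape (n, k) and
    collects control variates whose expectations are known to be zero.  The optimal
    coefficients are estimated by least squares on centered variables."""
    x = np.asarray(x, dtype=float)
    c = np.asarray(c, dtype=float)
    beta, *_ = np.linalg.lstsq(c - c.mean(axis=0), x - x.mean(), rcond=None)
    return float(np.mean(x - c @ beta))
```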

Causal inference has been increasingly reliant on observational studies with rich covariate information. To build tractable causal procedures, such as the doubly robust estimators, it is imperative to first extract important features from high or even ultra-high dimensional data. In this paper, we propose causal ball screening for confounder selection from modern ultra-high dimensional data sets. Unlike the familiar task of variable selection for prediction modeling, our confounder selection procedure aims to control for confounding while improving efficiency in the resulting causal effect estimate. Previous empirical and theoretical studies imply that one should exclude causes of the treatment that are not confounders. Motivated by these results, our goal is to keep all the predictors of the outcome in both the propensity score and outcome regression models. A distinctive feature of our proposal is that we use an outcome model-free procedure for propensity score model selection, thereby maintaining double robustness in the resulting causal effect estimator. Our theoretical analyses show that the proposed procedure enjoys a number of properties, including model selection consistency, normality and efficiency. Synthetic and real data analyses show that our proposal compares favorably with existing methods in a range of realistic settings.
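
An illustrative end-to-end workflow (not the paper's method) is sketched below: covariates are screened by a simple marginal association with the outcome, and the retained set is used in both nuisance models of an AIPW (doubly robust) estimator. The paper's ball-covariance screening statistic and its theoretical guarantees are not reproduced.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def screen_then_aipw(x, t, y, d):
    """Keep the d covariates most correlated with the outcome, then form an AIPW
    estimate of the ATE using the retained set in both nuisance models."""
    scores = np.abs([np.corrcoef(x[:, j], y)[0, 1] for j in range(x.shape[1])])
    keep = np.argsort(scores)[::-1][:d]
    xs = x[:, keep]
    e = LogisticRegression(max_iter=1000).fit(xs, t).predict_proba(xs)[:, 1]
    m1 = LinearRegression().fit(xs[t == 1], y[t == 1]).predict(xs)
    m0 = LinearRegression().fit(xs[t == 0], y[t == 0]).predict(xs)
    ate = np.mean(m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e))
    return ate, keep
```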

Causal inference has been a critical research topic in many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and the low budget required, compared with randomized controlled trials. Drawing on the rapidly developing machine learning field, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the most well-known causal inference frameworks. The methods are divided into two categories depending on whether or not they require all three assumptions of the potential outcome framework. For each category, both traditional statistical methods and recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets and open-source codes are summarized, which helps researchers and practitioners explore, evaluate, and apply causal inference methods.
