This paper introduces weighted conformal p-values for model-free selective inference. Suppose we observe units with covariates $X$ and missing responses $Y$; the goal is to select those units whose responses exceed user-specified values while controlling the proportion of falsely selected units. We extend [JC22] to settings with covariate shift between training and test samples, making no modeling assumptions on the data and placing no restrictions on the model used to predict the responses. Using any predictive model, we first construct well-calibrated weighted conformal p-values, which control the type-I error in detecting a large response (outcome) for each individual unit. However, the covariate-dependent weights can violate a well-known positive dependence property among the p-values, which complicates the use of classical multiple testing procedures. We therefore introduce weighted conformalized selection (WCS), a new multiple testing procedure that leverages a special conditional independence structure implied by weighted exchangeability to achieve false discovery rate (FDR) control in finite samples. Beyond prediction-assisted candidate screening, we study how WCS (1) enables simultaneous inference on multiple individual treatment effects, and (2) extends to outlier detection when the distribution of reference inliers differs from that of test inliers. We demonstrate performance via simulations and apply WCS to causal inference, drug discovery, and outlier detection datasets.
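To make the first step concrete, the following is a minimal sketch of one common deterministic form of a weighted conformal p-value. It is an illustration under stated assumptions, not the paper's reference implementation: the function name `weighted_conformal_pvalues`, the argument names, and the convention that a small test score is evidence of a large response are all hypothetical choices made here. It assumes nonconformity scores and covariate-shift weights $w(x)$ (proportional to the test-to-training covariate density ratio) have already been computed. It covers only the per-unit p-values and does not implement the WCS selection procedure.

```python
import numpy as np

def weighted_conformal_pvalues(calib_scores, calib_weights, test_scores, test_weights):
    """Sketch of weighted conformal p-values (hypothetical helper, not from the paper).

    For test unit j the p-value is
        (sum_i w_i * 1{V_i < V_j_test} + w_j_test) / (sum_i w_i + w_j_test),
    where V_i are calibration scores and w_i are covariate-shift weights.
    Assumed convention: a small test score is evidence of a large response.
    """
    calib_scores = np.asarray(calib_scores, dtype=float)
    calib_weights = np.asarray(calib_weights, dtype=float)
    test_scores = np.asarray(test_scores, dtype=float)
    test_weights = np.asarray(test_weights, dtype=float)

    # Total calibration weight, shared by every test unit's denominator.
    total = calib_weights.sum()

    # below[j]: total weight of calibration points whose score is strictly
    # smaller than the score of test unit j (broadcast to an m-by-n mask).
    mask = calib_scores[None, :] < test_scores[:, None]
    below = (calib_weights[None, :] * mask).sum(axis=1)

    # Each test unit contributes its own weight to numerator and denominator.
    return (below + test_weights) / (total + test_weights)
```

With all weights equal to one, this reduces to the usual unweighted conformal p-value, which gives a quick sanity check of the sketch.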