Although a few methods have been developed recently for building confidence intervals after model selection, how to construct confidence sets for joint post-selection inference is still an open question. In this paper, we develop a new method to construct confidence sets after lasso variable selection, with strong numerical support for its accuracy and effectiveness. A key component of our method is to sample from the conditional distribution of the response $y$ given the lasso active set, which, in general, is very challenging due to the tiny probability of the conditioning event. We overcome this technical difficulty by using estimator augmentation to simulate from this conditional distribution via Markov chain Monte Carlo given any estimate $\tilde{\mu}$ of the mean $\mu_0$ of $y$. We then incorporate a randomization step for the estimate $\tilde{\mu}$ in our sampling procedure, which may be interpreted as simulating from a posterior predictive distribution by averaging over the uncertainty in $\mu_0$. Our Monte Carlo samples offer great flexibility in the construction of confidence sets for multiple parameters. Extensive numerical results show that our method is able to construct confidence sets with the desired coverage rate and, moreover, that the diameter and volume of our confidence sets are substantially smaller in comparison with a state-of-the-art method.
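To illustrate why conditioning on the lasso active set is hard, the following is a minimal sketch of naive rejection sampling. It is not the estimator augmentation sampler described above. It assumes an identity design matrix, so that the lasso reduces to soft-thresholding and coordinate $j$ is active exactly when $|y_j| > \lambda$, and it assumes Gaussian noise with known standard deviation. The function names and the values of `mu_tilde`, `lam`, and `p` are hypothetical choices for illustration only.

```python
import random


def lasso_active_set(y, lam):
    """Active set of the lasso under the ASSUMED identity design.

    With X = I the lasso solution is the soft-thresholding of y,
    so coordinate j is active exactly when |y_j| > lam.
    """
    return frozenset(j for j, yj in enumerate(y) if abs(yj) > lam)


def rejection_sample(mu_tilde, lam, target, n_draws, sigma=1.0, seed=0):
    """Draw y ~ N(mu_tilde, sigma^2 I) and keep only the draws whose
    lasso active set equals `target` (naive rejection sampling)."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        y = [rng.gauss(m, sigma) for m in mu_tilde]
        if lasso_active_set(y, lam) == target:
            accepted.append(y)
    return accepted


if __name__ == "__main__":
    p = 20                                   # hypothetical dimension
    mu_tilde = [3.0, 3.0] + [0.0] * (p - 2)  # hypothetical estimate of the mean
    lam = 1.5                                # hypothetical penalty level
    target = frozenset({0, 1})               # active set to condition on
    n_draws = 20000
    draws = rejection_sample(mu_tilde, lam, target, n_draws)
    print(f"acceptance rate: {len(draws) / n_draws:.4f}")
```

In this toy setting each inactive coordinate must independently fall inside $[-\lambda, \lambda]$, so the acceptance probability shrinks geometrically as the number of inactive coordinates grows. With a general design matrix the conditioning event is more constrained still, which is the difficulty the MCMC approach is meant to address.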