Suppose that one can construct a valid $(1-\delta)$-confidence interval (CI) for each of $K$ parameters of potential interest. If a data analyst uses an arbitrary data-dependent criterion to select some subset $\mathcal{S}$ of parameters, then the aforementioned CIs for the selected parameters are no longer valid due to selection bias. We design a new method to adjust the intervals so as to control the false coverage rate (FCR). The main established method is the "BY procedure" of Benjamini and Yekutieli (JASA, 2005). Unfortunately, the BY guarantees require certain restrictions on the selection criterion and on the dependence between the CIs. We propose a natural and much simpler method, the e-BY procedure, that is valid under any dependence structure between the original CIs and any (unknown) selection criterion, but that applies only to a special, yet broad, class of CIs. Our procedure reports $(1-\delta|\mathcal{S}|/K)$-CIs for the selected parameters, and we prove that it controls the FCR at $\delta$ for confidence intervals that implicitly invert e-values; examples include those constructed via supermartingale methods, via universal inference, or via Chernoff-style bounds on the moment generating function, among others. The e-BY procedure is admissible, and it recovers the BY procedure as a special case via calibration. Our work also has implications for post-selection inference in sequential settings, since it applies at stopping times, to continuously monitored confidence sequences, and under bandit sampling.
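To make the level adjustment concrete, the following is a minimal Python sketch of the rule described above: each selected parameter receives a CI at level $1-\delta|\mathcal{S}|/K$. It is an illustration under stated assumptions, not the authors' implementation. The names `hoeffding_e_ci` and `e_by` are hypothetical, the data are assumed to be independent and $[0,1]$-valued, and the CI is one Chernoff-style choice obtained by inverting the e-value $E(\mu) = e^{-n\lambda^2/8}\cosh\big(\lambda n(\bar{x}-\mu)\big)$. The tuning parameter `lam` is assumed to be fixed in advance, so that the e-value does not depend on the data-dependent level.

```python
import math


def hoeffding_e_ci(xs, alpha, lam):
    """Two-sided (1 - alpha)-CI for the mean of independent [0, 1]-valued data.

    Hypothetical helper: inverts the e-value
        E(mu) = exp(-n * lam**2 / 8) * cosh(lam * n * (xbar - mu)),
    i.e. returns {mu : E(mu) < 1 / alpha}, clipped to [0, 1].
    `lam` > 0 must be chosen before seeing the data.
    """
    n = len(xs)
    xbar = sum(xs) / n
    # Solve cosh(lam * n * |xbar - mu|) = exp(n * lam^2 / 8) / alpha for |xbar - mu|.
    half_width = math.acosh(math.exp(n * lam**2 / 8) / alpha) / (lam * n)
    return max(0.0, xbar - half_width), min(1.0, xbar + half_width)


def e_by(samples, selected, delta, lam):
    """Report (1 - delta * |S| / K)-CIs for the selected indices.

    `samples` holds one list of observations per parameter (K lists in total);
    `selected` is the index set S, chosen by any data-dependent rule.
    """
    K = len(samples)
    adjusted_alpha = delta * len(selected) / K
    return {k: hoeffding_e_ci(samples[k], adjusted_alpha, lam) for k in selected}
```

For instance, with $K = 3$ parameters, $\delta = 0.1$, and two selected parameters, each reported interval is built at level $1 - 0.1 \cdot 2/3$, so it is wider than the unadjusted $0.9$-CI for the same data.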