Decision-making and scientific discovery pipelines, such as job hiring and drug discovery, often involve multiple stages: before any resource-intensive step, an initial screening typically uses predictions from a machine learning model to shortlist a few candidates from a large pool. We study screening procedures that aim to select candidates whose unobserved outcomes exceed user-specified values. We develop a method that wraps around any prediction model to produce a subset of candidates while controlling the proportion of falsely selected units. Building on the conformal inference framework, our method first constructs p-values that quantify the statistical evidence for large outcomes; it then determines the shortlist by comparing the p-values to a threshold introduced in the multiple testing literature. In many cases, the procedure selects candidates whose predictions are above a data-dependent threshold. Our theoretical guarantee holds under mild exchangeability conditions on the samples, generalizing existing results on multiple conformal p-values. We demonstrate the empirical performance of our method via simulations and apply it to job hiring and drug discovery datasets.
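The two-step procedure described above (conformal p-values, then a multiple-testing threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the exact construction used in the paper: the score is assumed to be the residual `y - prediction`, ties are not randomized, and the threshold is assumed to be the Benjamini-Hochberg rule. The function names `conformal_pvalues` and `benjamini_hochberg` and all argument names are hypothetical.

```python
from bisect import bisect_left


def conformal_pvalues(calib_pred, calib_y, test_pred, thresholds):
    """Conformal p-values for the nulls H_j: outcome_j <= thresholds[j].

    Assumed score (illustrative choice): V(x, y) = y - prediction(x).
    Calibration scores use the observed outcomes; test scores plug in
    the user-specified threshold in place of the unobserved outcome.
    """
    calib_scores = sorted(y - mu for y, mu in zip(calib_y, calib_pred))
    n = len(calib_scores)
    pvals = []
    for mu, c in zip(test_pred, thresholds):
        test_score = c - mu
        # Number of calibration scores strictly below the test score.
        n_below = bisect_left(calib_scores, test_score)
        pvals.append((n_below + 1) / (n + 1))
    return pvals


def benjamini_hochberg(pvalues, q):
    """Indices selected by the Benjamini-Hochberg rule at nominal level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda j: pvalues[j])
    k = 0  # largest rank whose ordered p-value is below q * rank / m
    for rank, j in enumerate(order, start=1):
        if pvalues[j] <= q * rank / m:
            k = rank
    return sorted(order[:k])


def screen(calib_pred, calib_y, test_pred, thresholds, q):
    """Shortlist test candidates: p-values first, then the BH threshold."""
    pvals = conformal_pvalues(calib_pred, calib_y, test_pred, thresholds)
    return benjamini_hochberg(pvals, q)
```

Under this assumed score and a common threshold for all candidates, each p-value is non-increasing in the candidate's prediction, so the shortlist consists of the candidates whose predictions exceed a cutoff determined by the data, consistent with the behavior noted in the abstract.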