We introduce prediction-powered inference, a framework for performing valid statistical inference when an experimental data set is supplemented with predictions from a machine-learning system such as AlphaFold. Our framework yields provably valid conclusions without making any assumptions on the machine-learning algorithm that supplies the predictions. Higher accuracy of the predictions translates to smaller confidence intervals, permitting more powerful inference. Prediction-powered inference yields simple algorithms for computing valid confidence intervals for statistical objects such as means, quantiles, and linear and logistic regression coefficients. We demonstrate the benefits of prediction-powered inference with data sets from proteomics, genomics, electronic voting, remote sensing, census analysis, and ecology.
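To make the mean case concrete, here is a minimal sketch of one way such an interval can be formed: average the predictions on the large unlabeled set, then subtract the average prediction error measured on the small labeled set. The specific construction (a normal-approximation interval), the function names `ppi_mean_ci` and `classical_mean_ci`, and the synthetic data are all assumptions made for illustration; they are not taken from the text above.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def ppi_mean_ci(y_labeled, preds_labeled, preds_unlabeled, alpha=0.05):
    """Normal-approximation CI for a mean, using predictions plus an error correction.

    Hypothetical helper: an illustrative sketch, not a reference implementation.
    """
    n, big_n = len(y_labeled), len(preds_unlabeled)
    # Per-sample prediction error, measurable only where true labels exist.
    errors = [f - y for f, y in zip(preds_labeled, y_labeled)]
    # Point estimate: mean prediction on unlabeled data minus the mean error.
    theta_hat = fmean(preds_unlabeled) - fmean(errors)
    # Standard error adds the two independent sources of sampling noise.
    se = math.sqrt(variance(preds_unlabeled) / big_n + variance(errors) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z * se, theta_hat + z * se


def classical_mean_ci(y_labeled, alpha=0.05):
    """Baseline CI that uses the labeled data only."""
    se = math.sqrt(variance(y_labeled) / len(y_labeled))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    m = fmean(y_labeled)
    return m - z * se, m + z * se


# Synthetic data (assumed): outcomes ~ N(2, 1); predictions are biased and noisy.
rng = random.Random(0)


def predict(y):
    return y + 0.1 + rng.gauss(0.0, 0.2)


y_lab = [rng.gauss(2.0, 1.0) for _ in range(200)]
f_lab = [predict(y) for y in y_lab]
f_unlab = [predict(rng.gauss(2.0, 1.0)) for _ in range(20000)]

ppi_lo, ppi_hi = ppi_mean_ci(y_lab, f_lab, f_unlab)
cls_lo, cls_hi = classical_mean_ci(y_lab)
print(f"prediction-powered: ({ppi_lo:.3f}, {ppi_hi:.3f})")
print(f"classical:          ({cls_lo:.3f}, {cls_hi:.3f})")
```

In this sketch the interval width is driven by the variance of the prediction errors instead of the variance of the outcomes, which is why more accurate predictions give a narrower interval. The correction term removes the systematic bias of the predictor, so no assumption about the predictor's quality is needed for the estimate to be centered correctly.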