We introduce prediction-powered inference, a framework for performing valid statistical inference when an experimental data set is supplemented with predictions from a machine-learning system. Our framework yields provably valid conclusions without making any assumptions on the machine-learning algorithm that supplies the predictions. Higher accuracy of the predictions translates to smaller confidence intervals, permitting more powerful inference. Prediction-powered inference yields simple algorithms for computing valid confidence intervals for statistical objects such as means, quantiles, and linear and logistic regression coefficients. We demonstrate the benefits of prediction-powered inference with data sets from proteomics, genomics, electronic voting, remote sensing, census analysis, and ecology.
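To make the mean-estimation case concrete, here is a minimal sketch of one way such an interval might be computed. It is an illustration under stated assumptions, not the paper's reference implementation: it assumes a small labeled sample with both true outcomes and predictions, a large unlabeled sample with predictions only, and a normal-approximation (asymptotic) interval. The function name `ppi_mean_ci` and its parameters are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def ppi_mean_ci(y_labeled, yhat_labeled, yhat_unlabeled, alpha=0.1):
    """Sketch of a prediction-powered confidence interval for a mean.

    Assumptions (not taken from the abstract): the labeled and unlabeled
    samples are i.i.d. draws from the same population, and the interval
    relies on a normal approximation. All names are hypothetical.
    """
    y = np.asarray(y_labeled, dtype=float)
    f_lab = np.asarray(yhat_labeled, dtype=float)
    f_unl = np.asarray(yhat_unlabeled, dtype=float)
    n, N = len(y), len(f_unl)

    # Average prediction error on the labeled data; subtracting it
    # corrects the bias of the prediction-only estimate.
    rectifier = f_lab - y
    theta_hat = f_unl.mean() - rectifier.mean()

    # The two terms come from independent samples, so variances add.
    se = np.sqrt(f_unl.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z * se, theta_hat + z * se


def classical_mean_ci(y_labeled, alpha=0.1):
    """Standard normal-approximation interval using labeled data only."""
    y = np.asarray(y_labeled, dtype=float)
    se = np.sqrt(y.var(ddof=1) / len(y))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return y.mean() - z * se, y.mean() + z * se


if __name__ == "__main__":
    # Synthetic data for illustration only; not from the paper.
    rng = np.random.default_rng(0)
    y = rng.normal(size=200)
    yhat = y + 0.1 * rng.normal(size=200)  # fairly accurate predictions
    y_unl = rng.normal(size=20_000)
    yhat_unl = y_unl + 0.1 * rng.normal(size=20_000)
    print("prediction-powered:", ppi_mean_ci(y, yhat, yhat_unl))
    print("classical:         ", classical_mean_ci(y))
```

In this sketch the interval width is driven by the variance of the prediction errors on the labeled data rather than by the variance of the outcomes themselves, which is one way to see why more accurate predictions lead to smaller intervals. If the predictions are poor, the rectifier variance grows and the interval widens accordingly, but the bias correction is still applied.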