Applications of machine learning in the non-profit and public sectors often feature an iterative workflow of data acquisition, prediction, and optimization of interventions. There are four major pain points that a machine learning pipeline must overcome in order to be actually useful in these settings: small data, data collected only under the default intervention, objectives left unmodeled due to a communication gap, and unforeseen consequences of the intervention. In this paper, we introduce bandit data-driven optimization, the first iterative prediction-prescription framework to address these pain points. Bandit data-driven optimization combines the advantages of online bandit learning and offline predictive analytics in an integrated framework. We propose PROOF, a novel algorithm for this framework, and formally prove that it is no-regret. Using numerical simulations, we show that PROOF outperforms existing baselines. We also apply PROOF in a detailed case study of food rescue volunteer recommendation, and show that, as a framework, PROOF works well with the intricacies of ML models in real-world AI applications for the non-profit and public sectors.