We explore the promises and challenges of employing sequential decision-making algorithms, such as bandits, reinforcement learning, and active learning, in the public sector. Such algorithms have been heavily studied in settings suited to the private sector (e.g., online advertising); the public sector could also benefit greatly from these approaches, but it poses unique methodological challenges for machine learning. We highlight several applications of sequential decision-making algorithms in regulation and governance, and discuss areas for further research that would make them more widely applicable, fair, and effective. In particular, ensuring that these systems learn rational, causal decision-making policies can be difficult and requires great care. We also note the potential risks of such deployments and urge caution when conducting work in this area. We hope our work inspires more investigation of public-sector sequential decision-making applications, which pose unique challenges for machine learning researchers and can be socially beneficial.