Autonomous agents acting in the real world often operate based on models that ignore certain aspects of the environment. The incompleteness of any given model -- handcrafted or machine-acquired -- is inevitable due to the practical limitations of modeling techniques for complex real-world settings. Because of its model's limited fidelity, an agent's actions may have unexpected, undesirable consequences during execution. Learning to recognize and avoid such negative side effects of an agent's actions is critical to improving the safety and reliability of autonomous systems. Mitigating negative side effects is an emerging research topic that is attracting increased attention due to the rapid growth in the deployment of AI systems and their broad societal impacts. This article provides a comprehensive overview of different forms of negative side effects and the recent research efforts to address them. We identify key characteristics of negative side effects, highlight the challenges in avoiding them, and discuss recently developed approaches, contrasting their benefits and limitations. The article concludes with a discussion of open questions and suggestions for future research directions.