In recent years, AI safety has gained international recognition in light of heterogeneous safety-critical and ethical issues that risk overshadowing the broad beneficial impacts of AI. In this context, the implementation of AI observatory endeavors represents one key research direction. This paper motivates the need for an inherently transdisciplinary AI observatory approach integrating diverse retrospective and counterfactual views. We delineate aims and limitations while providing hands-on advice utilizing concrete practical examples. Distinguishing between unintentionally and intentionally triggered AI risks with diverse socio-psycho-technological impacts, we exemplify a retrospective descriptive analysis followed by a retrospective counterfactual risk analysis. Building on these AI observatory tools, we present near-term transdisciplinary guidelines for AI safety. As a further contribution, we discuss differentiated and tailored long-term directions through the lens of two disparate modern AI safety paradigms. For simplicity, we refer to these two paradigms with the terms artificial stupidity (AS) and eternal creativity (EC), respectively. While both AS and EC acknowledge the need for a hybrid cognitive-affective approach to AI safety and overlap with regard to many short-term considerations, they differ fundamentally in the nature of multiple envisaged long-term solution patterns. By compiling relevant underlying contradistinctions, we aim to provide future-oriented incentives for constructive dialectics in practical and theoretical AI safety research.