Trustworthy AI: From Principles to Practices

Rapidly developing artificial intelligence (AI) technology has enabled the deployment of a wide range of applied systems in the real world, affecting people's everyday lives. However, many current AI systems have been found to be vulnerable to imperceptible attacks, biased against underrepresented groups, and lacking in user privacy protection, among other problems. These shortcomings not only degrade the user experience but also erode society's trust in AI systems as a whole. In this review, we aim to provide AI practitioners with a comprehensive guide to building trustworthy AI systems. We first introduce the theoretical framework of the important aspects of AI trustworthiness, including robustness, generalization, explainability, transparency, reproducibility, fairness, privacy preservation, alignment with human values, and accountability. We then survey the leading approaches to these aspects in industry. To unify the currently fragmented approaches to trustworthy AI, we propose a systematic approach that considers the entire lifecycle of AI systems, ranging from data acquisition to model development, to system development and deployment, and finally to continuous monitoring and governance. Within this framework, we offer concrete action items for practitioners and societal stakeholders (e.g., researchers and regulators) to improve AI trustworthiness. Finally, we identify key opportunities and challenges in the future development of trustworthy AI systems, and we highlight the need for a paradigm shift towards comprehensively trustworthy AI systems.
