Context: Machine Learning (ML) has been at the heart of many innovations over the past years. However, including it in so-called 'safety-critical' systems, such as automotive or aeronautic systems, has proven very challenging, since the paradigm shift that ML brings completely changes traditional certification approaches. Objective: This paper aims to elucidate the challenges related to the certification of ML-based safety-critical systems, as well as the solutions proposed in the literature to tackle them, answering the question 'How to Certify Machine Learning Based Safety-critical Systems?'. Method: We conduct a Systematic Literature Review (SLR) of research papers published between 2015 and 2020, covering topics related to the certification of ML systems. In total, we identified 217 papers covering topics considered to be the main pillars of ML certification: Robustness, Uncertainty, Explainability, Verification, Safe Reinforcement Learning, and Direct Certification. We analyzed the main trends and problems of each sub-field and provided summaries of the papers extracted. Results: The SLR results highlight the enthusiasm of the community for this subject, as well as the lack of diversity in terms of datasets and types of models. They also emphasize the need to further develop connections between academia and industry to deepen the study of the domain. Finally, they illustrate the need to build connections between the above-mentioned main pillars, which are for now mainly studied separately. Conclusion: We highlight current efforts deployed to enable the certification of ML-based software systems, and discuss some future research directions.