Reproducibility is one of the core dimensions that contribute to Trustworthy Artificial Intelligence. Broadly speaking, reproducibility can be defined as the ability to repeat the same or a similar experiment or method and thereby obtain the same or similar results as the original scientists. It is an essential ingredient of the scientific method and crucial for gaining trust in relevant claims. Scientists have recently acknowledged a reproducibility crisis, which seems to affect Artificial Intelligence and Machine Learning even more strongly because of the complexity of the models at the core of their recent successes. Despite the recent debate on reproducibility in Artificial Intelligence, its practical implementation remains insufficient, in part because many technical issues are overlooked. In this survey, we critically review the current literature on the topic and highlight the open issues. Our contribution is threefold. First, we propose a concise review of the relevant terminology. Second, we collect and systematize existing recommendations for achieving reproducibility and put forth the means to comply with them. Third, we identify key elements often overlooked in modern Machine Learning and provide novel recommendations for them. We further specialize these recommendations for two critical application domains, namely biomedical and physical artificial intelligence.