While open databases are an important resource in the Deep Learning (DL) era, they are sometimes used "off-label": data published for one task are used to train algorithms for a different one. This work aims to highlight that in some cases, this common practice may lead to biased, overly optimistic results. We demonstrate this phenomenon for inverse problem solvers and show how their biased performance stems from hidden data-preprocessing pipelines. We describe two preprocessing pipelines typical of open-access databases and study their effects on three well-established algorithms developed for Magnetic Resonance Imaging (MRI) reconstruction: Compressed Sensing (CS), Dictionary Learning (DictL), and DL. Our results, obtained in an extensive large-scale study, demonstrate that the CS, DictL, and DL algorithms yield systematically biased results when na\"ively trained on seemingly appropriate data: the Normalized Root Mean Square Error (NRMSE) improves consistently with the preprocessing extent, showing an artificial improvement of 25%-48% in some cases. Since this phenomenon is generally unknown, biased results are sometimes published as state of the art; we refer to these as subtle inverse crimes. This work hence raises a red flag regarding the na\"ive off-label usage of Big Data and reveals the vulnerability of modern inverse problem solvers to the resulting bias.
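To make the reported effect concrete, the following is a minimal sketch (not the study's actual code or pipelines) of how a hidden preprocessing step can artificially improve the NRMSE of a reconstruction. The "preprocessing" (Gaussian smoothing), the "reconstruction" (zero-filled inverse FFT of undersampled k-space), and the random test image are all illustrative assumptions chosen only to show the mechanism.

\begin{verbatim}
# Minimal illustration: the same naive reconstruction scores a better NRMSE
# when the "ground truth" was silently preprocessed (smoothed) beforehand.
# All components here are stand-ins, not the paper's pipelines or algorithms.
import numpy as np
from scipy.ndimage import gaussian_filter

def nrmse(reference, estimate):
    """Root mean square error normalized by the reference's dynamic range."""
    err = np.sqrt(np.mean(np.abs(reference - estimate) ** 2))
    return err / (reference.max() - reference.min())

def zero_filled_recon(image, keep_fraction=0.5):
    """Naive 'reconstruction': keep only the central k-space rows, inverse FFT."""
    kspace = np.fft.fftshift(np.fft.fft2(image))
    n = image.shape[0]
    keep = int(n * keep_fraction)
    mask = np.zeros_like(kspace, dtype=bool)
    mask[(n - keep) // 2:(n + keep) // 2, :] = True
    return np.abs(np.fft.ifft2(np.fft.ifftshift(kspace * mask)))

rng = np.random.default_rng(0)
raw = rng.random((128, 128))                 # stand-in for unprocessed data
processed = gaussian_filter(raw, sigma=2.0)  # stand-in for hidden preprocessing

for name, img in [("raw", raw), ("preprocessed", processed)]:
    rec = zero_filled_recon(img)
    print(f"{name:>12}: NRMSE = {nrmse(img, rec):.3f}")
# The preprocessed (smoother) data yield a markedly lower NRMSE for the very
# same reconstruction procedure: the metric improves without any algorithmic
# gain, which is the kind of bias the abstract refers to.
\end{verbatim}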