Modern biology frequently relies on machine learning to provide predictions and improve decision processes. Recent calls have urged closer scrutiny of machine learning performance and its possible limitations. Here we present a set of community-wide recommendations that aim to help establish standards for validating supervised machine learning in biology. Adopting a structured methods description for machine learning based on data, optimization, model and evaluation (DOME) will help both reviewers and readers to better understand and assess the performance and limitations of a method or outcome. The recommendations are formulated as questions for anyone wishing to implement a machine learning algorithm. Answers to these questions can easily be included in the supplementary material of published papers.