This document is a concise outline of some of the common mistakes that occur when using machine learning, and what can be done to avoid them. Whilst it should be accessible to anyone with a basic understanding of machine learning techniques, it was originally written for research students, and focuses on issues that are of particular concern within academic research, such as the need to do rigorous comparisons and reach valid conclusions. It covers five stages of the machine learning process: what to do before model building, how to reliably build models, how to robustly evaluate models, how to compare models fairly, and how to report results.