A Survey of Human-in-the-loop for Machine Learning

Human-in-the-loop aims to train an accurate prediction model at minimum cost by integrating human knowledge and experience. Humans can provide training data for machine learning applications and, with the help of machine-based approaches, directly accomplish tasks in the pipeline that are hard for computers. In this paper, we survey existing work on human-in-the-loop from a data perspective and classify it into three categories with a progressive relationship: (1) work that improves model performance through data processing, (2) work that improves model performance through interventional model training, and (3) the design of system-independent human-in-the-loop. Using this categorization, we summarize the major approaches in the field along with their technical strengths and weaknesses, and we briefly classify and discuss applications in natural language processing, computer vision, and other areas. We also outline some open challenges and opportunities. This survey intends to provide a high-level summary of human-in-the-loop and to motivate interested readers to consider approaches for designing effective human-in-the-loop solutions.
