Machine learning (ML) has become a vital part of many aspects of our daily lives. However, building well-performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to build machine learning applications automatically, without extensive knowledge of statistics and machine learning. This paper combines a survey of current AutoML methods with a benchmark of popular AutoML frameworks on real data sets. Driven by the frameworks selected for evaluation, we summarize and review important AutoML techniques and methods for every step in building an ML pipeline. The selected AutoML frameworks are evaluated on 137 data sets from established AutoML benchmark suites.