Relevant, high-quality data are critical to the successful development of machine learning applications. For machine learning applications on dynamic systems equipped with a large number of sensors, such as connected vehicles and robots, finding relevant and high-quality data features efficiently is a challenging problem. In this work, we address the problem of feature selection in constrained continuous data acquisition. We propose a feedback-based dynamic feature selection algorithm that efficiently decides, step by step, which feature set to collect from a dynamic system. We formulate the sequential feature selection procedure as a Markov Decision Process. The reward function combines machine learning model performance feedback with an exploration component, and actions are chosen by $\epsilon$-greedy selection. Our evaluation shows that the proposed feedback-based feature selection algorithm outperforms constrained baseline methods and matches the performance of unconstrained baseline methods.
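To make the formulation concrete, the following is a minimal sketch of step-wise feature selection as a Markov Decision Process with $\epsilon$-greedy action selection. It is not the algorithm evaluated in this work: the tabular one-step Q-learning update, the count-based exploration bonus, and all names (`select_features`, `toy_score`, `bonus`, `lr`) are assumptions made for illustration. The state is the set of features chosen so far, an action adds one feature, and the reward is the change in a performance score plus the exploration bonus. `toy_score` is a hypothetical stand-in for real model performance feedback.

```python
import random
from collections import defaultdict


def toy_score(features):
    """Hypothetical stand-in for model performance feedback.

    Only features 1 and 3 are informative in this toy setup.
    """
    weights = {1: 0.5, 3: 0.4}
    return sum(weights.get(f, 0.0) for f in features)


def select_features(n_features, budget, evaluate, episodes=500,
                    epsilon=0.2, bonus=0.05, lr=0.5, seed=0):
    """Pick up to `budget` features, one per step, with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    budget = min(budget, n_features)   # cannot select more features than exist
    q = defaultdict(float)             # q[(state, action)]: estimated value
    visits = defaultdict(int)          # visit counts for the exploration bonus

    def candidates(state):
        return [f for f in range(n_features) if f not in state]

    for _ in range(episodes):
        state = frozenset()
        score = evaluate(state)
        while len(state) < budget:
            options = candidates(state)
            if rng.random() < epsilon:
                action = rng.choice(options)                      # explore
            else:
                action = max(options, key=lambda f: q[(state, f)])  # exploit
            next_state = state | {action}
            next_score = evaluate(next_state)

            # Reward: performance gain plus a count-based exploration bonus
            # (the bonus form is an assumption, not taken from the abstract).
            visits[(state, action)] += 1
            reward = (next_score - score) + bonus / visits[(state, action)] ** 0.5

            # One-step, undiscounted Q-learning update.
            done = len(next_state) >= budget
            future = 0.0 if done else max(
                q[(next_state, f)] for f in candidates(next_state))
            q[(state, action)] += lr * (reward + future - q[(state, action)])
            state, score = next_state, next_score

    # Greedy rollout over the learned values gives the final feature set.
    state = frozenset()
    while len(state) < budget:
        state = state | {max(candidates(state), key=lambda f: q[(state, f)])}
    return sorted(state)


if __name__ == "__main__":
    print(select_features(n_features=5, budget=2, evaluate=toy_score,
                          episodes=2000, epsilon=0.3))
```

In a real deployment, `evaluate` would retrain or re-score the downstream model on data collected with the candidate feature set, and `budget` would reflect the acquisition constraint. A tabular value function is only practical for a small number of features; it is used here purely to keep the sketch short.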