With the development of new sensors and monitoring devices, more sources of data become available as inputs for machine learning models. On the one hand, these can help improve the accuracy of a model. On the other hand, combining these new inputs with historical data remains a challenge that has not yet been studied in enough detail. In this work, we propose a transfer-learning algorithm that combines the new and the historical data and is especially beneficial when the new data is scarce. We focus on the linear regression case, which allows us to conduct a rigorous theoretical study of the benefits of the approach. We show that our approach is robust against negative transfer, and we confirm this result empirically with real and simulated data.
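
To make the setting concrete, the following is a minimal sketch of one simple way to combine a large historical sample (old features only) with a small new sample (old plus new features) in linear regression. It is an illustration only and not the algorithm proposed in this work, which the abstract does not specify. The helper `two_stage_fit`, the sample sizes, and the simulated data are all assumptions, including the simplifying assumption that the historical response does not depend on the new features.

```python
import numpy as np


def two_stage_fit(X_hist, y_hist, X_new_old, X_new_extra, y_new):
    """Illustrative two-stage fit (hypothetical helper, not the paper's method).

    X_hist      : historical samples, old features only
    X_new_old   : new samples, old features
    X_new_extra : new samples, newly available features
    """
    # Stage 1: estimate the old-feature coefficients from the historical data.
    beta_old, *_ = np.linalg.lstsq(X_hist, y_hist, rcond=None)
    # Stage 2: fit the new features to what stage 1 leaves unexplained.
    residual = y_new - X_new_old @ beta_old
    beta_extra, *_ = np.linalg.lstsq(X_new_extra, residual, rcond=None)
    return beta_old, beta_extra


# Simulated data (assumed sizes): many historical samples, few new ones.
rng = np.random.default_rng(0)
n_hist, n_new, p_old, p_extra = 500, 12, 5, 2
true_old = rng.normal(size=p_old)
true_extra = rng.normal(size=p_extra)

X_hist = rng.normal(size=(n_hist, p_old))
y_hist = X_hist @ true_old + 0.1 * rng.normal(size=n_hist)

X_new_old = rng.normal(size=(n_new, p_old))
X_new_extra = rng.normal(size=(n_new, p_extra))
y_new = (X_new_old @ true_old + X_new_extra @ true_extra
         + 0.1 * rng.normal(size=n_new))

beta_old, beta_extra = two_stage_fit(
    X_hist, y_hist, X_new_old, X_new_extra, y_new
)

# Baseline for comparison: ordinary least squares on the scarce new data only.
X_full = np.hstack([X_new_old, X_new_extra])
beta_baseline, *_ = np.linalg.lstsq(X_full, y_new, rcond=None)

# Squared estimation error of each approach against the simulated truth.
true_full = np.concatenate([true_old, true_extra])
err_two_stage = np.sum((np.concatenate([beta_old, beta_extra]) - true_full) ** 2)
err_baseline = np.sum((beta_baseline - true_full) ** 2)
print(err_two_stage, err_baseline)
```

A naive scheme like this one has no safeguard when the historical and new data follow different relationships, which is the negative-transfer situation the abstract says the proposed approach is robust against.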