We propose a new method for supervised learning with multiple sets of features ("views"). Cooperative learning combines the usual squared-error loss of predictions with an "agreement" penalty that encourages the predictions from different data views to agree. By varying the weight of the agreement penalty, we obtain a continuum of solutions that includes the well-known early and late fusion approaches. Cooperative learning chooses the degree of agreement (or fusion) adaptively, using a validation set or cross-validation to estimate test-set prediction error. One version of our fitting procedure is modular: one can choose different fitting mechanisms (e.g., lasso, random forests, boosting, neural networks) appropriate for different data views. In the setting of cooperative regularized linear regression, the method combines the lasso penalty with the agreement penalty. The method can be especially powerful when the different data views share an underlying relationship in their signals that we aim to strengthen, while each view carries idiosyncratic noise that we aim to reduce. We illustrate the effectiveness of the proposed method on simulated and real data examples.
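To make the description above concrete, here is a minimal sketch of the kind of objective the abstract refers to, written for two views X and Z; the notation (prediction functions f_X and f_Z, agreement weight ρ, lasso level λ) is ours and chosen for illustration. The first display gives a general two-view objective with the agreement penalty; the second specializes to cooperative regularized linear regression with lasso penalties on the coefficients of each view.

\[
\min_{f_X,\, f_Z}\ \tfrac{1}{2}\,\bigl\|y - f_X(X) - f_Z(Z)\bigr\|^2
\;+\; \tfrac{\rho}{2}\,\bigl\|f_X(X) - f_Z(Z)\bigr\|^2
\]

\[
\min_{\theta_x,\, \theta_z}\ \tfrac{1}{2}\,\bigl\|y - X\theta_x - Z\theta_z\bigr\|^2
\;+\; \tfrac{\rho}{2}\,\bigl\|X\theta_x - Z\theta_z\bigr\|^2
\;+\; \lambda\bigl(\|\theta_x\|_1 + \|\theta_z\|_1\bigr)
\]

Under this sketch, ρ = 0 reduces to a single fit on the concatenated views (an early-fusion solution), larger ρ pulls the two views' predictions toward each other, and, as noted above, other points along the ρ continuum correspond to late fusion; ρ itself is the degree of agreement chosen via a validation set or cross-validation.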
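A convenient property of the linear-regression form above is that, for a fixed ρ, it can be recast as an ordinary lasso problem on augmented data, so any standard lasso solver applies. The sketch below exploits this; it is an illustrative implementation built on scikit-learn, not the authors' released software, and the function name cooperative_lasso and its arguments are hypothetical. Note that sklearn's Lasso scales the squared-error term by 1/(2n), so alpha corresponds to λ only up to that scaling.

```python
import numpy as np
from sklearn.linear_model import Lasso

def cooperative_lasso(X, Z, y, rho, alpha):
    """Sketch of cooperative regularized linear regression for two views.

    Solves (up to sklearn's 1/(2n) loss scaling):
        1/2 ||y - X@tx - Z@tz||^2 + rho/2 ||X@tx - Z@tz||^2
            + lambda * (||tx||_1 + ||tz||_1)
    by recasting it as a single lasso on augmented data.
    """
    n, px = X.shape
    r = np.sqrt(rho)
    # Encode the agreement penalty as extra "observations" with response 0:
    # the second row block contributes rho * ||X@tx - Z@tz||^2 to the loss.
    X_aug = np.vstack([np.hstack([X, Z]),
                       np.hstack([-r * X, r * Z])])
    y_aug = np.concatenate([y, np.zeros(n)])
    fit = Lasso(alpha=alpha, fit_intercept=False).fit(X_aug, y_aug)
    return fit.coef_[:px], fit.coef_[px:]

# Toy usage: a shared latent signal plus view-specific noise, matching the
# setting where the abstract suggests the method is most powerful.
rng = np.random.default_rng(0)
n = 200
u = rng.normal(size=n)                      # shared underlying signal
X = u[:, None] + rng.normal(size=(n, 5))    # view 1: signal + its own noise
Z = u[:, None] + rng.normal(size=(n, 5))    # view 2: signal + its own noise
y = 2 * u + rng.normal(size=n)
theta_x, theta_z = cooperative_lasso(X, Z, y, rho=0.5, alpha=0.05)
```

In practice one would wrap this in a loop over a grid of ρ (and λ) values and pick the pair minimizing validation or cross-validation error, mirroring the adaptive choice of fusion described above.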