We extend best-subset selection to linear Multi-Task Learning (MTL), in which a set of linear models is jointly trained on a collection of datasets (``tasks''). Allowing the regression coefficients of different tasks to have different sparsity patterns (i.e., different supports), we propose a modeling framework for MTL that encourages models to share information across tasks for a given covariate by separately 1) shrinking the coefficient supports together, and/or 2) shrinking the coefficient values together. This allows models to borrow strength during variable selection even when the coefficient values differ markedly between tasks. We express our modeling framework as a Mixed-Integer Program and propose efficient, scalable algorithms based on block coordinate descent and combinatorial local search. We show that our estimator achieves statistically optimal prediction rates. Importantly, our theory characterizes how the estimator leverages shared support information across tasks to achieve better variable selection performance. We evaluate the performance of our method in simulations and in two biology applications. Our proposed approaches outperform other sparse MTL methods in both variable selection and prediction accuracy. Interestingly, penalties that shrink the supports together often outperform penalties that shrink the coefficient values together. We will release an R package implementing our methods.
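To make the two information-sharing mechanisms above concrete, the following is one schematic form such an objective could take; the notation (support indicators $z_{j,k}$, across-task averages $\bar z_j$ and $\bar\beta_j$, tuning parameters $\lambda_z$ and $\lambda_\beta$, and per-task sparsity budget $s$) is illustrative and not necessarily the paper's exact formulation:
\[
\min_{\{\beta_k\}_{k=1}^{K}} \; \sum_{k=1}^{K} \frac{1}{2 n_k} \lVert y_k - X_k \beta_k \rVert_2^2
\;+\; \lambda_z \sum_{j=1}^{p} \sum_{k=1}^{K} \bigl( z_{j,k} - \bar z_j \bigr)^2
\;+\; \lambda_\beta \sum_{j=1}^{p} \sum_{k=1}^{K} \bigl( \beta_{j,k} - \bar\beta_j \bigr)^2
\]
\[
\text{subject to } \; z_{j,k} = \mathbb{1}\bigl[\beta_{j,k} \neq 0\bigr], \qquad \lVert \beta_k \rVert_0 \le s \;\; \text{for each task } k.
\]
In this sketch, the $\lambda_z$ term pulls the binary support indicators toward their across-task average (shrinking the supports together), while the $\lambda_\beta$ term pulls the coefficient values toward a common center (shrinking the values together); setting either multiplier to zero decouples that form of sharing, leaving per-task best-subset problems linked only through the remaining penalty.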