Multi-Task Learning (MTL) has achieved success in various fields. However, how to balance different tasks to achieve good overall performance remains a key problem. To achieve task balancing, many works carefully design dynamic loss/gradient weighting strategies, but the basic random-weighting experiments needed to examine their effectiveness have been ignored. In this paper, we propose Random Weighting (RW) methods, including Random Loss Weighting (RLW) and Random Gradient Weighting (RGW), in which an MTL model is trained with random loss/gradient weights sampled from a distribution. To show the effectiveness and necessity of RW methods, we theoretically analyze the convergence of RW and reveal that RW has a higher probability of escaping local minima, resulting in better generalization. Empirically, we extensively evaluate the proposed RW methods against twelve state-of-the-art methods on five image datasets and two multilingual problems from the XTREME benchmark, showing that RW methods achieve performance comparable to state-of-the-art baselines. We therefore argue that RW methods are important baselines for MTL and should attract more attention.
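To make the idea concrete, below is a minimal sketch of Random Loss Weighting in PyTorch: at every training step, fresh random weights are sampled and normalized, and the weighted sum of per-task losses is optimized. The weight distribution (softmax over standard normal samples) and the names `model`, `loss_fns`, and the batch layout are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal RLW sketch (assumed setup, not the paper's official implementation).
import torch
import torch.nn.functional as F

def rlw_step(model, optimizer, batch, loss_fns):
    """One training step with randomly sampled loss weights (RLW)."""
    optimizer.zero_grad()
    outputs = model(batch["input"])  # assumed: model returns one prediction per task
    # Per-task losses stacked into a single tensor
    losses = torch.stack([
        fn(out, tgt) for fn, out, tgt in zip(loss_fns, outputs, batch["targets"])
    ])
    # Sample random weights from a standard normal and normalize with softmax
    weights = F.softmax(torch.randn(len(losses), device=losses.device), dim=-1)
    total_loss = (weights * losses).sum()
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```

Random Gradient Weighting (RGW) follows the same idea but applies the sampled weights to per-task gradients rather than to the losses.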