Generalized text representations are the foundation of many natural language understanding tasks. To fully exploit different corpora, a model inevitably needs to understand the relevance among the tasks they cover. However, many methods ignore this relevance and directly adopt a single-channel model (a coarse paradigm) for all tasks, which lacks sufficient rationality and interpretability. In addition, some existing works learn downstream tasks by stitching together skill blocks (a fine paradigm), which may produce irrational results due to redundancy and noise. In this work, we first analyze task correlation from three perspectives, i.e., data properties, manual design, and model-based relevance, and group similar tasks together accordingly. We then propose a hierarchical framework with a coarse-to-fine paradigm: the bottom level is shared across all tasks, the mid level is divided among task groups, and the top level is assigned to each individual task. This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact of irrelevant tasks. Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.
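To make the three-level hierarchy concrete, below is a minimal sketch in PyTorch of a shared bottom encoder, group-level mid layers, and task-specific top heads. All module names, layer sizes, group names, and the task-to-group mapping are hypothetical illustrations for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class CoarseToFineModel(nn.Module):
    """Hypothetical coarse-to-fine multi-task model: shared bottom,
    per-group mid layers, per-task top heads."""

    def __init__(self, hidden=256, groups=("classification", "matching"),
                 task_to_group=None, num_labels=None):
        super().__init__()
        # Bottom level (coarse): one encoder shared by all tasks,
        # learning basic language properties from all corpora.
        self.shared = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        # Mid level: one branch per task group, so similar tasks share
        # parameters while dissimilar tasks stay separated.
        self.group_layers = nn.ModuleDict(
            {g: nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
             for g in groups}
        )
        # Top level (fine): one lightweight head per task.
        self.task_to_group = task_to_group or {}
        self.heads = nn.ModuleDict(
            {t: nn.Linear(hidden, n) for t, n in (num_labels or {}).items()}
        )

    def forward(self, x, task):
        h = self.shared(x)                                   # all tasks
        h = self.group_layers[self.task_to_group[task]](h)   # task group
        return self.heads[task](h)                           # single task


# Example usage with two hypothetical tasks routed to different groups:
model = CoarseToFineModel(
    task_to_group={"sst2": "classification", "qqp": "matching"},
    num_labels={"sst2": 2, "qqp": 2},
)
logits = model(torch.randn(4, 256), task="sst2")
```

Under this routing, gradients from every task update the shared bottom, only gradients from tasks in the same group update a mid-level branch, and each head is updated by its own task alone, which is one way to realize the stated goal of boosting relevant tasks while limiting interference from irrelevant ones.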