Learning to construct text representations in end-to-end systems can be difficult, as natural languages are highly compositional and task-specific annotated datasets are often limited in size. Methods for directly supervising language composition can allow us to guide the models based on existing knowledge, regularizing them towards more robust and interpretable representations. In this paper, we investigate how objectives at different granularities can be used to learn better language representations, and we propose an architecture for jointly learning to label sentences and tokens. The predictions at each level are combined using an attention mechanism, with token-level labels also acting as explicit supervision for composing sentence-level representations. Our experiments show that by learning to perform these tasks jointly on multiple levels, the model achieves substantial improvements for both sentence classification and sequence labeling.
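The core mechanism the abstract describes can be sketched as soft attention over token representations, where the per-token scores serve double duty: they are trained as token-level label predictions and, after normalization, act as attention weights for composing the sentence-level representation. The following is a minimal illustrative sketch, not the paper's implementation; the dimensions, the scoring vector `w`, and the sigmoid token predictor are all assumptions for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
T, d = 5, 8                          # sequence length, hidden size (assumed)
H = rng.standard_normal((T, d))      # token representations from an encoder
w = rng.standard_normal(d)           # attention scoring vector (assumed)

e = H @ w                            # unnormalized per-token scores, shape (T,)
token_preds = 1 / (1 + np.exp(-e))   # token-level label probabilities
                                     # (supervised by token annotations)
a = softmax(e)                       # the same scores, normalized into
                                     # attention weights
sentence_repr = a @ H                # attention-weighted sentence vector, (d,)
```

Because the attention weights and the token-label predictions share the scores `e`, supervising the token labels directly shapes how the sentence representation is composed, which is the joint objective the abstract refers to.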