Neural text generation models often suffer from low output diversity. Various decoding strategies and training-based methods have been proposed to promote diversity, but they exploit only contextual features and rarely incorporate syntactic structure cues. In this work, we propose using linguistic annotation, i.e., part-of-speech (POS) tags, to guide text generation. Specifically, we introduce POS Guided Softmax to explicitly model two posterior probabilities: (i) the next POS tag, and (ii) the next token, drawn from the vocabulary of the target POS. We further propose a POS Guided Sampling strategy that addresses the low-diversity problem by enriching the diversity of POS tags. Extensive experiments and human evaluations show that, compared with existing state-of-the-art methods, our POS Guided Softmax and Sampling (POSG) generates more diverse text while maintaining comparable quality.
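The two-stage factorization described above can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the toy vocabulary, tag set, and probability tables are all hypothetical, and in the actual model both distributions would be produced by a neural network conditioned on the context.

```python
import random

# Hypothetical toy vocabulary partitioned by POS tag (illustrative only).
TOKEN_PROBS_BY_POS = {
    "NOUN": {"cat": 0.6, "dog": 0.3, "idea": 0.1},
    "VERB": {"runs": 0.7, "sleeps": 0.3},
    "ADJ":  {"quick": 0.5, "lazy": 0.5},
}

# Hypothetical next-POS distribution; in POSG this would come from the model.
POS_PROBS = {"NOUN": 0.5, "VERB": 0.3, "ADJ": 0.2}


def pos_guided_sample(pos_probs, token_probs_by_pos, rng=random):
    """Two-stage sampling: first draw a POS tag from the next-POS
    distribution, then draw a token restricted to that tag's vocabulary."""
    tags = list(pos_probs)
    pos = rng.choices(tags, weights=[pos_probs[t] for t in tags])[0]
    tokens = list(token_probs_by_pos[pos])
    weights = [token_probs_by_pos[pos][t] for t in tokens]
    token = rng.choices(tokens, weights=weights)[0]
    return pos, token


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for reproducibility
    for _ in range(5):
        print(pos_guided_sample(POS_PROBS, TOKEN_PROBS_BY_POS, rng))
```

Because the token draw is constrained to the sampled tag's vocabulary, boosting entropy at the POS stage directly diversifies the syntactic shape of the output, which is the intuition behind POS Guided Sampling.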