Copy mechanisms explicitly obtain unchanged tokens from the source (input) sequence when generating the target (output) sequence under the neural seq2seq framework. However, most existing copy mechanisms consider only single-word copying from the source sentence, which loses essential tokens when copying long spans. In this work, we propose a plug-and-play architecture, BioCopy, to alleviate this problem. Specifically, in the training stage, we construct a BIO tag for each token and train the original model jointly with the BIO tags. In the inference stage, the model first predicts the BIO tag at each time step, then applies a different masking strategy based on the predicted BIO label to narrow the probability distribution over the vocabulary. Experimental results on two separate generative tasks show that adding BioCopy to the original model structure outperforms the baseline models on both.
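The inference-stage masking described above can be illustrated with a minimal sketch. This is a hypothetical helper, not the authors' implementation: it assumes a predicted BIO tag per step, an indexable list of source token ids, and additive log-space masking before the softmax. Under a "B" tag the distribution is restricted to tokens appearing in the source; under an "I" tag, to the single source token that continues the span copied at the previous step; under an "O" tag the full vocabulary is kept.

```python
import numpy as np

def bio_mask(logits, tag, source_ids, vocab_size, prev_copy_pos=None):
    """Illustrative BioCopy-style masking (sketch, not the authors' code).

    logits:        unnormalized scores over the vocabulary, shape (vocab_size,)
    tag:           predicted BIO tag at this step ("B", "I", or "O")
    source_ids:    token ids of the source sentence, in order
    prev_copy_pos: index into source_ids of the token copied at the last step
    """
    if tag == "O":
        return logits  # generate freely from the whole vocabulary
    mask = np.full(vocab_size, -np.inf)
    if tag == "B":
        # Start of a copied span: allow any token that occurs in the source.
        mask[list(set(source_ids))] = 0.0
    elif tag == "I" and prev_copy_pos is not None and prev_copy_pos + 1 < len(source_ids):
        # Continue the span: allow only the next token of the source sequence.
        mask[source_ids[prev_copy_pos + 1]] = 0.0
    else:
        return logits  # no valid continuation; fall back to the full vocabulary
    # Adding -inf in log space zeroes out disallowed tokens after softmax.
    return logits + mask
```

In a real decoder the BIO tag itself would come from a jointly trained classification head, and the masked logits would feed the usual softmax and beam search unchanged, which is what makes the mechanism plug-and-play.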