This work focuses on in-context data augmentation for intent detection. Having found that augmentation via in-context prompting of large pre-trained language models (PLMs) alone does not improve performance, we introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that measures how useful a datapoint is for training a model. Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints: utterances that correspond to given intents. It then applies intent-aware filtering, based on PVI, to remove datapoints that are not helpful to the downstream intent classifier. Our method is thus able to leverage the expressive power of large language models to produce diverse training data. Empirical results demonstrate that our method can produce synthetic training data that achieve state-of-the-art performance on three challenging intent detection datasets under few-shot settings (1.28% absolute improvement in 5-shot and 1.18% absolute in 10-shot, on average) and perform on par with the state of the art in full-shot settings (within 0.01% absolute, on average).
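To make the filtering step concrete, the sketch below computes PVI following its standard definition, PVI(x → y) = −log₂ g′[∅](y) + log₂ g′[x](y), i.e. the gain in log-probability of the gold label y when the model sees the input x versus a null input. The function names, the dictionary layout of the candidates, and the per-intent threshold scheme are illustrative assumptions, not the paper's actual implementation; the log-probabilities would come from the fine-tuned classifier.

```python
import math

def pvi(lp_y_given_x, lp_y_given_null):
    """Pointwise V-information of input x about label y:
    PVI(x -> y) = -log2 g'[null](y) + log2 g'[x](y).
    Inputs are natural-log probabilities of y from a model fine-tuned
    with the input (lp_y_given_x) and with the input withheld
    (lp_y_given_null); dividing by ln(2) converts to bits."""
    return (lp_y_given_x - lp_y_given_null) / math.log(2)

def filter_synthetic(candidates, thresholds):
    """Intent-aware filtering sketch: keep synthetic (utterance, intent)
    pairs whose PVI exceeds a per-intent threshold. Low or negative PVI
    marks datapoints unhelpful to the downstream intent classifier.
    The threshold dictionary is a hypothetical design choice here."""
    kept = []
    for c in candidates:
        score = pvi(c["lp_x"], c["lp_null"])
        if score > thresholds.get(c["intent"], 0.0):
            kept.append(c)
    return kept

# Toy example with made-up log-probabilities:
candidates = [
    {"utterance": "book me a table for two", "intent": "restaurant",
     "lp_x": -0.3, "lp_null": -2.5},   # informative: PVI ~ 3.2 bits
    {"utterance": "hmm okay", "intent": "restaurant",
     "lp_x": -2.4, "lp_null": -2.5},   # uninformative: PVI ~ 0.14 bits
]
kept = filter_synthetic(candidates, {"restaurant": 0.5})
```

Here only the first utterance survives the filter; the second barely changes the classifier's belief in the intent, so its PVI falls below the threshold.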