Inspired by recent work in meta-learning and generative teaching networks, we propose a framework called Generative Conversational Networks, in which conversational agents learn to generate their own labelled training data (given some seed data) and then train themselves on that data to perform a given task. We use reinforcement learning to optimize the data generation process, where the reward signal is the agent's performance on the task. The task can be any language-related task, from intent detection to full task-oriented conversations. In this work, we show that our approach is able to generalise from the seed data and performs well in limited-data and limited-computation settings, with significant gains for intent detection and slot tagging across multiple datasets: ATIS, TOD, SNIPS, and Restaurants8k. We show an average improvement of 35% in intent detection and 21% in slot tagging over a baseline model trained on the seed data. We also conduct an analysis of the novelty of the generated data and provide generated examples for intent detection, slot tagging, and non-goal-oriented conversations.
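The loop described above (a generator proposes labelled examples, a learner is trained on them, and the learner's task performance is fed back as a reinforcement-learning reward) can be sketched in a few dozen lines. The following is a minimal illustration, not the authors' implementation: the seed/validation utterances, the candidate pool the generator samples from, and the bag-of-words learner are all simplified stand-ins for the neural generator and task model used in the paper.

```python
# Minimal sketch of the Generative Conversational Networks training loop:
# a generator policy emits labelled examples, a learner is trained on
# seed + generated data, and validation accuracy is the REINFORCE reward.
# All data, candidates, and models below are illustrative stand-ins.
import random
from collections import Counter

import torch
from torch.distributions import Categorical

# Hypothetical seed and validation data: a few labelled utterances per intent.
SEED = [
    ("book a flight to boston", "book_flight"),
    ("i want to fly to denver tomorrow", "book_flight"),
    ("what is the weather in seattle", "get_weather"),
    ("will it rain in austin today", "get_weather"),
]
VALID = [
    ("find me a flight to chicago", "book_flight"),
    ("is it sunny in portland", "get_weather"),
]
INTENTS = sorted({y for _, y in SEED})

# Candidate utterances the generator can emit for each intent. In the paper
# the generator is a neural text generator; a fixed pool keeps this sketch
# short while preserving the learning signal (some candidates are mislabelled
# on purpose, so the reward can push the policy away from them).
CANDIDATES = {
    "book_flight": [
        "book a flight to boston", "get me a ticket to denver",
        "what is the weather in seattle", "i need to fly to miami",
    ],
    "get_weather": [
        "what is the weather in seattle", "will it rain in austin today",
        "book a flight to boston", "how hot is it in phoenix",
    ],
}

# Generator policy: per-intent logits over the candidate utterances.
logits = {y: torch.zeros(len(CANDIDATES[y]), requires_grad=True) for y in INTENTS}
optimizer = torch.optim.Adam(list(logits.values()), lr=0.1)


def featurize(text):
    return Counter(text.split())


def train_learner(examples):
    """Toy learner: bag-of-words centroid per intent (stand-in for a neural model)."""
    centroids = {y: Counter() for y in INTENTS}
    for x, y in examples:
        centroids[y].update(featurize(x))
    return centroids


def accuracy(centroids, data):
    def predict(x):
        feats = featurize(x)
        return max(INTENTS, key=lambda y: sum(feats[w] * centroids[y][w] for w in feats))
    return sum(predict(x) == y for x, y in data) / len(data)


baseline = 0.0  # running reward baseline to reduce REINFORCE variance
for step in range(200):
    # 1) Sample a batch of labelled examples from the generator policy.
    log_probs, generated = [], []
    for _ in range(8):
        intent = random.choice(INTENTS)
        dist = Categorical(logits=logits[intent])
        idx = dist.sample()
        log_probs.append(dist.log_prob(idx))
        generated.append((CANDIDATES[intent][idx.item()], intent))

    # 2) Train the learner on seed + generated data; 3) reward = validation accuracy.
    learner = train_learner(SEED + generated)
    reward = accuracy(learner, VALID)

    # 4) REINFORCE update of the generator with a moving-average baseline.
    advantage = reward - baseline
    baseline = 0.9 * baseline + 0.1 * reward
    loss = -advantage * torch.stack(log_probs).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Inspect the learned generation preferences per intent.
print({y: torch.softmax(logits[y], dim=0).detach() for y in INTENTS})
```

In this sketch the only learnable part of the generator is a categorical distribution over a fixed candidate pool, which keeps the example self-contained; the paper's framework replaces this with a conditional text generator and the toy learner with a full intent-detection or slot-tagging model.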