Discovering out-of-domain (OOD) intents is important for developing new skills in task-oriented dialogue systems. The key challenges lie in how to transfer prior in-domain (IND) knowledge to OOD clustering, and how to jointly learn OOD representations and cluster assignments. Previous methods suffer from the in-domain overfitting problem, and there is a natural gap between the representation-learning and clustering objectives. In this paper, we propose a unified K-nearest neighbor contrastive learning framework to discover OOD intents. Specifically, for the IND pre-training stage, we propose a KCL (K-nearest neighbor contrastive learning) objective that learns inter-class discriminative features while maintaining intra-class diversity, which alleviates the in-domain overfitting problem. For the OOD clustering stage, we propose a KCC method that forms compact clusters by mining true hard negative samples, which bridges the gap between clustering and representation learning. Extensive experiments on three benchmark datasets show that our method achieves substantial improvements over state-of-the-art methods.
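To make the KCL idea concrete, the following is a minimal, hypothetical sketch of a K-nearest-neighbor contrastive objective: instead of pulling an anchor toward every same-class sample (which collapses intra-class diversity), each anchor is pulled only toward its k nearest same-class neighbors in embedding space, while all other samples serve as the contrastive denominator. The function name, the NumPy implementation, and the specific loss form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def knn_contrastive_loss(z, labels, k=2, tau=0.1):
    """Illustrative KNN contrastive objective (not the paper's exact loss).

    For each anchor, the positives are its k most similar same-class
    neighbors; contrasting against only these neighbors (rather than the
    whole class) is what preserves intra-class diversity.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity space
    sim = z @ z.T / tau                               # temperature-scaled sims
    n = len(z)
    losses = []
    for i in range(n):
        # candidate positives: same-class samples other than the anchor
        same = np.where((labels == labels[i]) & (np.arange(n) != i))[0]
        if len(same) == 0:
            continue
        # keep only the k nearest (most similar) same-class neighbors
        pos = same[np.argsort(-sim[i, same])[:k]]
        mask = np.arange(n) != i                      # denominator: everyone but self
        denom = np.log(np.exp(sim[i, mask]).sum())
        # InfoNCE-style term averaged over the k selected positives
        losses.append(np.mean([denom - sim[i, p] for p in pos]))
    return float(np.mean(losses))
```

With two tight, well-separated clusters the loss approaches log(n_pos_in_denominator), since positives dominate the denominator and negatives contribute almost nothing; mixed embeddings yield a larger loss.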