Open intent classification is a practical yet challenging task in dialogue systems. Its objective is to accurately classify samples of known intents while simultaneously detecting samples of open (unknown) intents. Existing methods usually combine an outlier detection algorithm with a K-class classifier to detect open intents, where K is the number of known intent classes. In contrast, in this paper we consider an approach that does not use outlier detection algorithms. Specifically, we directly train a (K+1)-class classifier for open intent classification, where the (K+1)-th class represents open intents. To address the challenge of training a (K+1)-class classifier with training samples from only K classes, we propose a deep model based on Soft Labeling and Manifold Mixup (SLMM). In our method, soft labeling reshapes the label distribution of the known intent samples to reduce the model's overconfidence on known intents. Manifold mixup generates pseudo samples for open intents to better optimize the decision boundary of open intents. Experiments on four benchmark datasets demonstrate that our method outperforms previous methods and achieves state-of-the-art performance. All the code and data of this work can be obtained at https://github.com/zifengcheng/SLMM.
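The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the `open_mass` parameter (how much probability the soft label moves to the (K+1)-th class), the mixing range for `lam`, and the choice to mix only pairs from different known classes. The function names are hypothetical and are not taken from the SLMM repository, which remains the reference for the authors' actual implementation.

```python
import random


def soft_label(known_class, num_known, open_mass=0.2):
    """Build a (K+1)-dimensional soft target for a known-intent sample.

    Assumption: the target keeps 1 - open_mass on the true class and
    moves open_mass to the open class at index K.
    """
    if not 0 <= known_class < num_known:
        raise ValueError("known_class must be in [0, num_known)")
    if not 0.0 <= open_mass < 1.0:
        raise ValueError("open_mass must be in [0, 1)")
    target = [0.0] * (num_known + 1)
    target[known_class] = 1.0 - open_mass
    target[num_known] = open_mass
    return target


def mix_hidden(h_a, h_b, lam):
    """Convexly combine two hidden representations (mixup in hidden space)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]


def make_pseudo_open_samples(hidden, labels, num_known, n_samples, rng):
    """Create pseudo open-intent samples from known-intent hidden states.

    Assumption: pairs are drawn from different known classes, mixed with
    lam near 0.5, and labeled entirely as the open class (index K).
    """
    if len(set(labels)) < 2:
        raise ValueError("need samples from at least two known classes")
    indices = list(range(len(hidden)))
    open_target = [0.0] * num_known + [1.0]
    pseudo = []
    while len(pseudo) < n_samples:
        i, j = rng.sample(indices, 2)
        if labels[i] == labels[j]:
            continue  # same-class mixtures stay inside a known class
        lam = rng.uniform(0.4, 0.6)  # hypothetical mixing range
        pseudo.append((mix_hidden(hidden[i], hidden[j], lam), list(open_target)))
    return pseudo


if __name__ == "__main__":
    rng = random.Random(0)
    hidden = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]  # toy hidden states
    labels = [0, 1, 2]
    print(soft_label(1, num_known=3))
    for mixed, target in make_pseudo_open_samples(hidden, labels, 3, 2, rng):
        print(mixed, target)
```

In a real model the hidden states would come from an intermediate layer of the encoder, and both the soft targets and the pseudo samples would feed a cross-entropy loss over the K+1 outputs; the toy vectors above only show the shapes involved.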