Existing task-oriented chatbots rely heavily on spoken language understanding (SLU) systems to determine the intent of a user's utterance and other key information needed to fulfill specific tasks. In real-life applications, it is crucial to occasionally induce novel dialog intents from conversation logs to improve the user experience. In this paper, we propose the Density-based Deep Clustering Ensemble (DDCE) method for dialog intent induction. Compared to existing K-means-based methods, our proposed method is more effective in real-life scenarios where a large number of outliers exist. To maximize data utilization, we jointly optimize the text representations and the hyperparameters of the clustering algorithm. In addition, we design an outlier-aware clustering ensemble framework to handle the overfitting issue. Experimental results on seven datasets show that our proposed method significantly outperforms state-of-the-art baselines.
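To make the contrast with K-means concrete, the following is a minimal sketch of generic density-based clustering (plain DBSCAN), not the DDCE method itself, whose details the abstract does not give. The 2-D points are hypothetical stand-ins for utterance embeddings, and the names `dbscan`, `region_query`, `eps`, and `min_pts` are illustrative assumptions. The point of the sketch is that isolated points are labeled as noise rather than being forced into the nearest cluster, as K-means would do.

```python
import math

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one cluster id per point, NOISE for outliers."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return labels


# Hypothetical toy data: two dense groups (two "intents") plus two outliers.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (5, 30), (-20, 5),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

In this toy setting the two isolated points receive the noise label, while the number of clusters is discovered from the data rather than fixed in advance. The values of `eps` and `min_pts` are chosen by hand here; the abstract states that DDCE instead optimizes the clustering hyperparameters jointly with the text representations.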