An intelligent virtual assistant (IVA) enables effortless conversations in call routing through spoken utterance classification (SUC), a special form of spoken language understanding (SLU). Building a SUC system requires a large amount of supervised in-domain data, which is not always available. In this paper, we introduce an unsupervised spoken utterance classification approach (USUC) that requires no in-domain data other than the intent labels and a few paraphrases per intent. USUC consists of a KNN classifier (K=1) and a complex embedding model trained without supervision on a large customer service corpus. Among all the embedding models we compare, we demonstrate that ELMo works best for USUC. However, an ELMo model is too slow to be used at run-time for call routing. To resolve this issue, we first compute the uni- and bi-gram embedding vectors offline and build a lookup table that maps each n-gram to its embedding vector. We then use this table to compute sentence embedding vectors at run-time, along with back-off techniques for unseen n-grams. Experiments show that USUC outperforms traditional utterance classification methods, reducing the classification error rate from 32.9% to 27.0% without requiring supervised data. Moreover, our lookup and back-off technique increases the processing speed from 16 utterances per second to 118 utterances per second.
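The lookup-and-classify pipeline described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy table `NGRAM_TABLE` with made-up 3-dimensional vectors stands in for ELMo embeddings computed offline, the back-off rule (bigram, then unigram, then a zero vector) is one plausible choice, sentence vectors are formed by simple averaging, and cosine similarity is used for the nearest-neighbour (K=1) match against a hypothetical `PARAPHRASES` set.

```python
import math

# Hypothetical lookup table: n-gram -> embedding vector.
# In the described system these vectors would be computed offline by the
# embedding model; the values below are invented for illustration only.
NGRAM_TABLE = {
    ("pay", "bill"): [0.9, 0.1, 0.0],
    ("reset", "password"): [0.0, 0.1, 0.9],
    ("pay",): [0.8, 0.2, 0.0],
    ("bill",): [0.7, 0.3, 0.0],
    ("password",): [0.0, 0.2, 0.8],
    ("reset",): [0.1, 0.1, 0.8],
    ("my",): [0.3, 0.3, 0.3],
}
DIM = 3
UNK = [0.0] * DIM  # assumed last-resort vector for unseen unigrams

# Hypothetical intent labels with a few paraphrases each.
PARAPHRASES = {
    "billing": ["pay my bill", "pay bill"],
    "password_reset": ["reset my password", "reset password"],
}


def embed(utterance):
    """Average table vectors, backing off bigram -> unigram -> UNK."""
    tokens = utterance.lower().split()
    vectors = []
    i = 0
    while i < len(tokens):
        bigram = tuple(tokens[i:i + 2])
        if len(bigram) == 2 and bigram in NGRAM_TABLE:
            vectors.append(NGRAM_TABLE[bigram])
            i += 2  # the bigram covers two tokens
        else:
            vectors.append(NGRAM_TABLE.get((tokens[i],), UNK))
            i += 1
    if not vectors:
        return list(UNK)
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(utterance):
    """1-nearest-neighbour: return the intent of the closest paraphrase."""
    query = embed(utterance)
    best_intent, best_score = None, -1.0
    for intent, phrases in PARAPHRASES.items():
        for phrase in phrases:
            score = cosine(query, embed(phrase))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent
```

Calling `classify("i want to pay my bill")` exercises the unigram and unknown-word back-off paths, while `classify("reset password please")` hits the bigram entry directly. Because the table lookup replaces a forward pass through the embedding model, run-time cost in a sketch like this is dominated by dictionary lookups and vector averaging.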