Conventional conversational assistants extract text transcripts from the speech signal using automatic speech recognition (ASR) and then predict intent from the transcriptions. In end-to-end spoken language understanding (SLU), the speaker's intent is predicted directly from the speech signal without requiring an intermediate text transcript. As a result, the model can optimize directly for intent classification and avoid the cascading errors introduced by ASR. An end-to-end SLU system also reduces the latency of intent prediction. Although many datasets are publicly available for text-to-intent tasks, labeled speech-to-intent datasets are scarce, and none are available for Indian-accented speech. In this paper, we release the Skit-S2I dataset, the first publicly available Indian-accented SLU dataset, covering the banking domain in a conversational tonality. We experiment with multiple baselines, compare the representations of different pretrained speech encoders, and find that SSL-pretrained representations perform slightly better for speech-to-intent classification than ASR-pretrained representations, which lack prosodic features. The dataset and baseline code are available at \url{https://github.com/skit-ai/speech-to-intent-dataset}