Intent Classification Using Pre-trained Language Agnostic Embeddings For Low Resource Languages - 专知论文

April 18, 2022

Hemant Yadav,Akshat Gupta,Sai Krishna Rallabandi,Alan W Black,Rajiv Ratn Shah

Building Spoken Language Understanding (SLU) systems that do not rely on language-specific Automatic Speech Recognition (ASR) is an important yet less explored problem in language processing. In this paper, we present a comparative study aimed at employing a pre-trained acoustic model to perform SLU in low-resource scenarios. Specifically, we use three different embeddings extracted using Allosaurus, a pre-trained universal phone decoder: (1) Phone, (2) Panphone, and (3) Allo embeddings. These embeddings are then used to identify the spoken intent. We perform experiments across three languages, English, Sinhala, and Tamil, each with a different data size, to simulate high-, medium-, and low-resource scenarios. Our system improves on the state-of-the-art (SOTA) intent classification accuracy by approximately 2.11% for Sinhala and 7.00% for Tamil and achieves competitive results on English. Furthermore, we present a quantitative analysis of how the performance scales with the number of training examples used per intent.
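To make the idea of classifying intent from phone-level output concrete, here is a minimal sketch. It is not the authors' model: it swaps in a bag of phone n-grams with a nearest-centroid classifier, and it assumes the phone strings have already been produced by a universal phone recognizer such as Allosaurus (that step is omitted). The function names, the toy phone strings, and the intent labels are all hypothetical.

```python
from collections import Counter
import math


def phone_ngrams(phones, n=2):
    """Count unigrams and n-grams in a space-separated phone string."""
    tokens = phones.split()
    feats = Counter(tokens)
    for i in range(len(tokens) - n + 1):
        feats[" ".join(tokens[i:i + n])] += 1
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(examples):
    """Sum the n-gram counts of each intent's utterances into one centroid."""
    centroids = {}
    for phones, intent in examples:
        centroids.setdefault(intent, Counter()).update(phone_ngrams(phones))
    return centroids


def predict(centroids, phones):
    """Return the intent whose centroid is most similar to the utterance."""
    feats = phone_ngrams(phones)
    return max(centroids, key=lambda intent: cosine(feats, centroids[intent]))


# Hypothetical toy data: phone strings and labels invented for illustration.
TRAIN = [
    ("b a l a n s", "check_balance"),
    ("m a i b a l a n s", "check_balance"),
    ("t r a n s f e r", "transfer_money"),
    ("t r a n s f e r m a n i", "transfer_money"),
]

centroids = train(TRAIN)
print(predict(centroids, "b a l a n s p l i z"))
```

Because the classifier only sees phone symbols, nothing in it depends on a language-specific transcript, which is the property the abstract emphasizes. The learned embeddings compared in the paper would take the place of the raw n-gram counts used here.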

