自然辅助语言语言评价的自然辅助语言文本变异 (NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation) - 专知论文

会员服务 ·

0

SimPLe · 情景 · 会话智能体 · Performer · state-of-the-art ·

2021 年 11 月 9 日

NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation

翻译：自然辅助语言语言评价的自然辅助语言文本变异

David Alfonso-Hermelo,Ahmad Rashid,Abbas Ghaddar,Philippe Langlais,Mehdi Rezagholizadeh

from arxiv, 20 pages, 4 figures, accepted to NeurIPS 2021 Track Datasets and Benchmarks

Slot-filling and intent detection are the backbone of conversational agents such as voice assistants, and are active areas of research. Even though state-of-the-art techniques on publicly available benchmarks show impressive performance, their ability to generalize to realistic scenarios is yet to be demonstrated. In this work, we present NATURE, a set of simple spoken-language oriented transformations, applied to the evaluation set of datasets, to introduce human spoken language variations while preserving the semantics of an utterance. We apply NATURE to common slot-filling and intent detection benchmarks and demonstrate that simple perturbations from the standard evaluation set by NATURE can deteriorate model performance significantly. Through our experiments we demonstrate that when NATURE operators are applied to evaluation set of popular benchmarks the model accuracy can drop by up to 40%.

翻译：信箱填充和意图探测是语音助理等谈话代理人的骨干,也是活跃的研究领域。尽管在公开的基准上最先进的技术表现出令人印象深刻的业绩,但是它们是否有能力概括现实的情景尚有待证明。在这项工作中,我们展示了一套简单的面向语言的变异,适用于数据集的评价组,以引入人的语言变异,同时保留语义的语义。我们将天体图应用于通用的空档填充和意图探测基准,并表明从NATURE的标准评估中简单的扰动可以显著地恶化模型性能。我们通过实验证明,当用天体图操作者来评价流行基准时,模型精确度可以下降40%。

0

相关内容

SimPLe

中国数据要素市场发展报告（2020~2021），65页pdf

中国数据要素市场发展报告（2020~2021），65页pdf

专知会员服务

142+阅读 · 2021年5月11日

自然语言处理中的注意力机制，Attention in Natural Language Processing

自然语言处理中的注意力机制，Attention in Natural Language Processing

专知会员服务

136+阅读 · 2020年5月30日

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

专知会员服务

84+阅读 · 2020年3月24日

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

专知会员服务

80+阅读 · 2020年3月5日

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

专知会员服务

33+阅读 · 2020年2月29日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

已删除

将门创投

3+阅读 · 2019年6月12日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Nature 一周论文导读 | 2018 年 5 月 24 日

Nature 一周论文导读 | 2018 年 5 月 24 日

科研圈

11+阅读 · 2018年5月27日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

Arxiv

0+阅读 · 2022年1月12日

How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets

How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets

Arxiv

0+阅读 · 2022年1月12日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Reverse Engineering Configurations of Neural Text Generation Models

Arxiv

5+阅读 · 2020年4月13日

Few-shot Natural Language Generation for Task-Oriented Dialog

Few-shot Natural Language Generation for Task-Oriented Dialog

Arxiv

30+阅读 · 2020年2月27日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Arxiv

34+阅读 · 2019年10月24日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

BERT for Joint Intent Classification and Slot Filling

Arxiv

13+阅读 · 2019年2月28日

Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks

Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks

Arxiv

3+阅读 · 2018年8月16日

VIP会员

文章信息

相关主题

会话智能体

state-of-the-art

相关VIP内容

中国数据要素市场发展报告（2020~2021），65页pdf

中国数据要素市场发展报告（2020~2021），65页pdf

专知会员服务

142+阅读 · 2021年5月11日

自然语言处理中的注意力机制，Attention in Natural Language Processing

自然语言处理中的注意力机制，Attention in Natural Language Processing

专知会员服务

136+阅读 · 2020年5月30日

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

专知会员服务

84+阅读 · 2020年3月24日

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

专知会员服务

80+阅读 · 2020年3月5日

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

专知会员服务

33+阅读 · 2020年2月29日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《2024年度美国防部作战测试与评估报告》500页

《面相未来作战空中系统中有人-无人编组的AI驱动协作模式选择》含slides

无人机编队飞行：复杂环境中作战的策略、挑战与应用

《探索军事背景下共享大语言模型：AI助手与智能体部署中可扩展性与效率的早期洞察》（含44页slides）

相关资讯

已删除

将门创投

3+阅读 · 2019年6月12日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Nature 一周论文导读 | 2018 年 5 月 24 日

Nature 一周论文导读 | 2018 年 5 月 24 日

科研圈

11+阅读 · 2018年5月27日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

相关论文

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

Arxiv

0+阅读 · 2022年1月12日

How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets

How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets

Arxiv

0+阅读 · 2022年1月12日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Reverse Engineering Configurations of Neural Text Generation Models

Arxiv

5+阅读 · 2020年4月13日

Few-shot Natural Language Generation for Task-Oriented Dialog

Few-shot Natural Language Generation for Task-Oriented Dialog

Arxiv

30+阅读 · 2020年2月27日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Arxiv

34+阅读 · 2019年10月24日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

BERT for Joint Intent Classification and Slot Filling

Arxiv

13+阅读 · 2019年2月28日

Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks

Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks

Arxiv

3+阅读 · 2018年8月16日

微信扫码咨询专知VIP会员