We focus on the recognition of Dyck-n ($\mathcal{D}_n$) languages with self-attention (SA) networks, which has been deemed a difficult task for these networks. We compare the performance of two variants of SA, one with a starting symbol (SA$^+$) and one without (SA$^-$). Our results show that SA$^+$ is able to generalize to longer sequences and deeper dependencies. For $\mathcal{D}_2$, we find that SA$^-$ completely breaks down on long sequences whereas the accuracy of SA$^+$ is 58.82\%. We find attention maps learned by SA$^+$ to be amenable to interpretation and compatible with a stack-based language recognizer. Surprisingly, the performance of SA networks is on par with LSTMs, which provides evidence of the ability of SA to learn hierarchies without recursion.
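For reference, a stack-based recognizer for $\mathcal{D}_2$ of the kind alluded to above can be sketched in a few lines; the snippet below is an illustrative assumption (not the authors' code), with the function name \texttt{is\_dyck2} and the two-bracket alphabet chosen for exposition.

```python
def is_dyck2(s: str) -> bool:
    """Minimal sketch: check membership in Dyck-2 (two bracket pairs)
    with an explicit stack, e.g. "([])()" is accepted, "([)]" is not."""
    pairs = {")": "(", "]": "["}
    stack = []
    for ch in s:
        if ch in "([":
            stack.append(ch)                      # push opening bracket
        elif ch in ")]":
            if not stack or stack.pop() != pairs[ch]:
                return False                      # unmatched or mismatched closer
        else:
            return False                          # symbol outside the D_2 alphabet
    return not stack                              # every opener must be closed

assert is_dyck2("([])()")
assert not is_dyck2("([)]")
```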