A journey in ESN and LSTM visualisations on a language task

Echo State Networks (ESN) and Long Short-Term Memory networks (LSTM) are two popular Recurrent Neural Network (RNN) architectures for machine learning tasks involving sequential data. However, little has been done to compare their performance and their internal mechanisms on a common task. In this work, we trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task. This task aims at modelling how infants learn language: they create associations between words and visual stimuli in order to extract meaning from words and sentences. The results are of three kinds: performance comparison, analysis of internal dynamics, and visualisation of the latent space. (1) We found that both models were able to learn the task successfully: the LSTM reached the lowest error on the basic corpus, but the ESN was quicker to train. Furthermore, the ESN outperformed LSTMs on more challenging datasets without any further tuning. (2) We also analysed the internal unit activations of LSTMs and ESNs. Despite the deep differences between the two models (trained versus fixed internal weights), we were able to uncover similar inner mechanisms: both put emphasis on the units encoding aspects of the sentence structure. (3) Moreover, we present Recurrent States Space Visualisations (RSSviz), a method to visualise the structure of the latent state space of RNNs, based on dimension reduction (using UMAP). This technique enables us to observe a fractal embedding of sequences in the LSTM. RSSviz is also useful for the analysis of ESNs (i) to spot difficult examples and (ii) to generate animated plots showing the evolution of activations across learning stages. Finally, we explore qualitatively how RSSviz could provide an intuitive visualisation of the influence of hyperparameters on the reservoir dynamics prior to ESN training.
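To make the "fixed internal weights" side of the comparison concrete, here is a minimal, dependency-free sketch of an echo state reservoir reading a sentence word by word. It is not the authors' implementation: the vocabulary, the sentence, the reservoir size, the leak rate, and the helper names (`make_reservoir`, `run_reservoir`, `one_hot`) are all assumptions chosen for illustration. The collected states are the kind of data a method like RSSviz would then project to two dimensions; that projection step is not shown here.

```python
import math
import random


def make_reservoir(n_units, n_inputs, scale=0.9, seed=0):
    """Draw fixed random input and recurrent weights (they are never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_units)]
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in range(n_units)]
    # Crude stabilisation (an assumption of this sketch): rescale so the largest
    # absolute row sum equals `scale`, which bounds the spectral radius by `scale`.
    max_row = max(sum(abs(v) for v in row) for row in w)
    w = [[v * scale / max_row for v in row] for row in w]
    return w_in, w


def run_reservoir(w_in, w, inputs, leak=0.5):
    """Return the reservoir state after each input vector (leaky tanh update)."""
    n_units = len(w)
    x = [0.0] * n_units
    states = []
    for u in inputs:
        new_x = []
        for i in range(n_units):
            pre = sum(w_in[i][j] * u[j] for j in range(len(u)))
            pre += sum(w[i][k] * x[k] for k in range(n_units))
            new_x.append((1.0 - leak) * x[i] + leak * math.tanh(pre))
        x = new_x
        states.append(x)
    return states


def one_hot(index, size):
    """Encode a word index as a one-hot input vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical toy vocabulary and sentence, not taken from the paper's corpus.
vocab = ["the", "cup", "is", "on", "left"]
sentence = ["the", "cup", "is", "on", "the", "left"]
inputs = [one_hot(vocab.index(word), len(vocab)) for word in sentence]

w_in, w = make_reservoir(n_units=20, n_inputs=len(vocab))
states = run_reservoir(w_in, w, inputs)
```

In an ESN only a linear readout on top of `states` would be trained, whereas an LSTM would also learn its recurrent weights. Note that the two occurrences of "the" yield different states, because each state depends on the words read before it.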


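Related content

Long short-term memory networks

Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in deep learning. Unlike standard feedforward neural networks, an LSTM has feedback connections. It can process not only single data points (such as images) but also entire sequences of data (such as speech or video). For example, LSTMs are applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition, and anomaly detection in network traffic or intrusion detection systems (IDSs).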
