We propose a novel memory network model, the Read-Write Memory Network (RWMN), to perform question answering for large-scale, multimodal movie story understanding. The key focus of our RWMN model is the design of the read network and the write network, each consisting of multiple convolutional layers, which give memory read and write operations high capacity and flexibility. While existing memory-augmented network models treat each memory slot as an independent block, our use of multi-layered CNNs allows the model to read and write sequential memory cells as chunks, which better represents a sequential story because adjacent memory blocks often have strong correlations. For evaluation, we apply our model to all six tasks of the MovieQA benchmark and achieve the best accuracies on several of them, especially the visual QA task. Our model shows potential to understand not only the content of the story but also more abstract information, such as relationships between characters and the reasons for their actions.
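To illustrate the core idea of reading or writing adjacent memory slots as chunks, the following is a minimal sketch of a convolutional memory operation: a kernel slides over a stack of memory slots along the time axis so that each output cell summarizes a window of neighboring slots. All shapes, the `tanh` nonlinearity, and the function name `conv_memory` are illustrative assumptions, not the paper's actual architecture or filter sizes.

```python
import numpy as np

def conv_memory(slots, kernel, stride):
    """Convolve a stack of memory slots (T x d) along the time axis.

    Each output cell aggregates k adjacent slots into one chunk,
    unlike slot-independent memory addressing.
    kernel: (k, d, d) -- k temporal taps mapping d-dim slots to d-dim cells.
    (Hypothetical shapes; the paper's read/write networks differ in detail.)
    """
    T, d = slots.shape
    k = kernel.shape[0]
    out = []
    for t in range(0, T - k + 1, stride):
        chunk = slots[t:t + k]                          # k adjacent slots
        cell = np.tanh(np.einsum('kd,kde->e', chunk, kernel))
        out.append(cell)
    return np.stack(out)                                # compressed memory (T' x d)

# toy example: 10 slots of dim 4, kernel spanning 3 adjacent slots, stride 2
rng = np.random.default_rng(0)
slots = rng.standard_normal((10, 4))
kernel = rng.standard_normal((3, 4, 4)) * 0.1
mem = conv_memory(slots, kernel, stride=2)
print(mem.shape)  # (4, 4): 10 slots compressed into 4 chunked cells
```

Because each output cell depends on a window of consecutive slots, correlated neighboring scenes are summarized jointly rather than addressed one slot at a time.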