Adaptive Computation with Elastic Input Sequence

When solving a problem, humans adapt the type of information they use, the procedure they follow, and the amount of time they spend on it. Most standard neural networks, however, apply the same function type and a fixed computation budget to every sample, regardless of its nature and difficulty. Adaptivity is a powerful paradigm: it gives practitioners flexibility in how these models are used downstream, and it can also serve as a strong inductive bias for solving certain challenging classes of problems. In this work, we propose a new strategy, AdaTape, that enables dynamic computation in neural networks via adaptive tape tokens. AdaTape employs an elastic input sequence by equipping an existing architecture with a dynamic read-and-write tape. Specifically, we adaptively generate input sequences using tape tokens obtained from a tape bank that can either be trainable or generated from the input data. We analyze the challenges and requirements of obtaining dynamic sequence content and length, and propose the Adaptive Tape Reader (ATR) algorithm to achieve both objectives. In extensive experiments on image recognition tasks, we show that AdaTape can achieve better performance while maintaining the computational cost.
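To make the idea of an elastic input sequence concrete, the following is a minimal NumPy sketch of appending a variable number of tape tokens to an input. It is not the ATR algorithm from the paper, whose details the abstract does not give. The function name `append_tape_tokens`, the mean-pooled query, the softmax scoring, and the cumulative-probability halting rule with its `threshold` and `max_tape` parameters are all assumptions chosen only for illustration.

```python
import numpy as np


def append_tape_tokens(tokens, tape_bank, threshold=0.5, max_tape=4):
    """Append a variable number of tape tokens to an input sequence.

    tokens:    (n, d) array of input token embeddings.
    tape_bank: (m, d) array of candidate tape tokens (could be learned
               parameters or derived from the input; here just an array).
    Returns the extended sequence and the indices of the chosen tape tokens.
    """
    # Hypothetical query: mean-pool the input tokens into one vector.
    query = tokens.mean(axis=0)

    # Score every tape token against the query and normalize with softmax.
    scores = tape_bank @ query
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()

    # Visit tape tokens from most to least relevant (stable for ties).
    order = np.argsort(-probs, kind="stable")

    # Hypothetical halting rule: stop once the selected tokens cover
    # `threshold` of the probability mass, or after `max_tape` tokens.
    chosen, mass = [], 0.0
    for idx in order[:max_tape]:
        chosen.append(int(idx))
        mass += probs[idx]
        if mass >= threshold:
            break

    extended = np.concatenate([tokens, tape_bank[chosen]], axis=0)
    return extended, chosen
```

Under this assumed rule, an input that matches one tape token strongly receives a single extra token, while an input with no clear match receives several, so both the content and the length of the sequence depend on the input.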

