How Crucial is Transformer in Decision Transformer?

Decision Transformer (DT) is a recently proposed architecture for Reinforcement Learning that frames the decision-making process as an auto-regressive sequence modeling problem and uses a Transformer model to predict the next action in a sequence of states, actions, and rewards. In this paper, we analyze how crucial the Transformer model is to the complete DT architecture on continuous control tasks. Specifically, we replace the Transformer with an LSTM model while keeping the other parts unchanged, obtaining what we call a Decision LSTM model. We compare it to DT on continuous control tasks, including pendulum swing-up and stabilization, in simulation and on physical hardware. Our experiments show that DT struggles with continuous control problems such as inverted pendulum and Furuta pendulum stabilization. The proposed Decision LSTM, on the other hand, achieves expert-level performance on these tasks and additionally learns a swing-up controller on the real system. These results suggest that the strength of the Decision Transformer for continuous control tasks may lie in the overall sequential modeling architecture and not in the Transformer per se.
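
The sequence-modeling framing described above can be illustrated with a minimal sketch of how a trajectory might be flattened into the token sequence that the backbone (Transformer or LSTM) consumes. This is an assumption-laden illustration, not the authors' code: it assumes the reward signal enters as an undiscounted return-to-go, as in the original Decision Transformer formulation, and the names `returns_to_go` and `interleave_tokens` are hypothetical helpers introduced only for this example.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of rewards from each timestep to the end of the episode."""
    rtg: List[float] = []
    running = 0.0
    # Accumulate from the last timestep backwards, then restore time order.
    for reward in reversed(rewards):
        running += reward
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave_tokens(
    rewards: Sequence[float],
    states: Sequence[Any],
    actions: Sequence[Any],
) -> List[Tuple[str, Any]]:
    """Flatten a trajectory into (return_to_go, state, action) triples in time order.

    Assumption: each element becomes one token; a real model would embed
    each token before passing the sequence to the backbone.
    """
    if not (len(rewards) == len(states) == len(actions)):
        raise ValueError("rewards, states, and actions must have equal length")
    tokens: List[Tuple[str, Any]] = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        tokens.append(("return_to_go", rtg))
        tokens.append(("state", state))
        tokens.append(("action", action))
    return tokens
```

Under this framing the backbone only ever sees the flattened sequence, so swapping the Transformer for an LSTM leaves this preprocessing untouched, which is consistent with the abstract's statement that the other parts are kept unchanged.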

