Text games present opportunities for natural language understanding (NLU) methods to tackle reinforcement learning (RL) challenges. However, recent work has questioned the necessity of NLU by showing that random text hashes can perform decently. In this paper, we pursue a fine-grained investigation into the roles of text in the face of different RL challenges, and show that semantic and non-semantic language representations can be complementary rather than contrasting. Concretely, we propose a simple scheme that extracts relevant contextual information into an approximate state hash, used as extra input for an RNN-based text agent. This lightweight plug-in achieves performance competitive with state-of-the-art text agents that use advanced NLU techniques such as knowledge graphs and passage retrieval, suggesting that non-NLU methods may suffice to tackle the challenge of partial observability. However, if we remove the RNN encoder and use the approximate, or even a ground-truth, state hash alone, the model performs poorly, which confirms the importance of semantic function approximation for tackling the challenge of combinatorially large observation and action spaces. Our findings and analysis provide new insights for designing better text-game task setups and agents.
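To make the hash-as-extra-input idea concrete, here is a minimal sketch of an RNN text encoder augmented with an approximate state-hash embedding. Everything here is an illustrative assumption rather than the paper's exact architecture: the class name `HashAugmentedEncoder`, the bucket count `num_buckets`, the use of MD5 over a context string, and the fusion by concatenation are all hypothetical choices made for the sketch.

```python
import hashlib

import torch
import torch.nn as nn


class HashAugmentedEncoder(nn.Module):
    """Hypothetical sketch: an RNN text encoder whose output is concatenated
    with an embedding of a non-semantic, approximate state hash."""

    def __init__(self, vocab_size=1000, embed_dim=32, hidden_dim=64, num_buckets=4096):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Embedding table indexed by the hashed (approximate) state id.
        self.hash_embed = nn.Embedding(num_buckets, hidden_dim)
        self.num_buckets = num_buckets

    def state_hash(self, context_text: str) -> int:
        # Map relevant contextual text (e.g. location + inventory strings)
        # to a bucket id; the semantics of the text are deliberately ignored.
        digest = hashlib.md5(context_text.encode("utf-8")).hexdigest()
        return int(digest, 16) % self.num_buckets

    def forward(self, token_ids: torch.Tensor, context_text: str) -> torch.Tensor:
        # token_ids: (batch, seq_len) observation tokens.
        _, h = self.rnn(self.token_embed(token_ids))  # h: (1, batch, hidden)
        bucket = torch.tensor([self.state_hash(context_text)])
        h_hash = self.hash_embed(bucket)              # (1, hidden)
        # Concatenate semantic (RNN) and non-semantic (hash) features.
        return torch.cat([h[-1], h_hash.expand(token_ids.size(0), -1)], dim=-1)


enc = HashAugmentedEncoder()
tokens = torch.randint(0, 1000, (2, 7))
out = enc(tokens, "Kitchen | inventory: lamp, key")
print(out.shape)  # torch.Size([2, 128])
```

Dropping the RNN branch and keeping only `hash_embed` would correspond to the hash-only ablation the abstract describes, which loses the ability to generalize across the combinatorially large observation and action spaces.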