Reinforcement learning (RL) has recently shown impressive performance in complex game AI and robotics tasks. To a large extent, this is thanks to the availability of simulated environments such as OpenAI Gym, the Arcade Learning Environment, or Malmo, which allow agents to learn complex tasks through interaction with virtual environments. While RL is also increasingly applied to natural language processing (NLP), there are no simulated textual environments available that allow researchers to apply and consistently benchmark RL on NLP tasks. With the work reported here, we therefore release NLPGym, an open-source Python toolkit that provides interactive textual environments for standard NLP tasks such as sequence tagging, multi-label classification, and question answering. We also present experimental results for six tasks using different RL algorithms, which serve as baselines for further research. The toolkit is published at https://github.com/rajcscw/nlp-gym.
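To illustrate what "interactive textual environments" means in practice, the sketch below shows a random-policy rollout against any environment exposing the standard OpenAI Gym interface (reset/step/action_space), which is the interface NLPGym environments are designed around. The helper `run_random_episode` and the commented-out `SeqTagEnv` import are illustrative assumptions, not the toolkit's documented API; the actual environment classes, data pools, and reward functions are described in the repository linked above.

```python
# A minimal sketch of interacting with a Gym-style environment such as those
# provided by NLPGym. Only the standard OpenAI Gym interface is used here.
import gym


def run_random_episode(env: gym.Env) -> float:
    """Roll out one episode with a random policy and return the total reward."""
    done = False
    observation = env.reset()          # initial textual observation
    total_reward = 0.0
    while not done:
        action = env.action_space.sample()                 # random action
        observation, reward, done, info = env.step(action)  # environment feedback
        total_reward += reward
    return total_reward


# Hypothetical usage with an NLPGym sequence-tagging environment
# (class name and import path are illustrative only; see the repository):
# from nlp_gym.envs.seq_tagging.env import SeqTagEnv
# env = SeqTagEnv(...)
# print(run_random_episode(env))
```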