Dungeons and Data: A Large-Scale NetHack Dataset

From arXiv; 9 pages; to be published in the Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), Track on Datasets and Benchmarks.

Recent breakthroughs in the development of agents to solve challenging sequential decision making problems such as Go, StarCraft, or DOTA, have relied on both simulated environments and large-scale datasets. However, progress on this research has been hindered by the scarcity of open-sourced datasets and the prohibitive computational cost to work with them. Here we present the NetHack Learning Dataset (NLD), a large and highly-scalable dataset of trajectories from the popular game of NetHack, which is both extremely challenging for current methods and very fast to run. NLD consists of three parts: 10 billion state transitions from 1.5 million human trajectories collected on the NAO public NetHack server from 2009 to 2020; 3 billion state-action-score transitions from 100,000 trajectories collected from the symbolic bot winner of the NetHack Challenge 2021; and, accompanying code for users to record, load and stream any collection of such trajectories in a highly compressed form. We evaluate a wide range of existing algorithms including online and offline RL, as well as learning from demonstrations, showing that significant research advances are needed to fully leverage large-scale datasets for challenging sequential decision making tasks.
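To put the two trajectory collections on a common scale, the rounded figures in the abstract can be turned into an average trajectory length. The sketch below is only back-of-the-envelope arithmetic on those figures; the names `NLD_PARTS`, `mean_transitions_per_trajectory`, and `summarize` are illustrative and are not part of the NLD code release.

```python
# Rounded figures quoted in the abstract, so the results are approximate.
NLD_PARTS = {
    "human (NAO server, 2009-2020)": {
        "transitions": 10_000_000_000,
        "trajectories": 1_500_000,
    },
    "symbolic bot (NetHack Challenge 2021)": {
        "transitions": 3_000_000_000,
        "trajectories": 100_000,
    },
}


def mean_transitions_per_trajectory(transitions: int, trajectories: int) -> float:
    """Average number of state transitions in one trajectory."""
    return transitions / trajectories


def summarize(parts: dict) -> dict:
    """Map each dataset part to its average trajectory length."""
    return {
        name: mean_transitions_per_trajectory(**counts)
        for name, counts in parts.items()
    }


if __name__ == "__main__":
    for name, mean_len in summarize(NLD_PARTS).items():
        print(f"{name}: about {mean_len:,.0f} transitions per trajectory")
```

On these rounded numbers, a human trajectory averages several thousand transitions and a bot trajectory several times more, which suggests the two parts differ substantially in episode length as well as in total size.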

