We present an empirical dataset surveying the deep learning phenomenon on fully-connected networks, encompassing the training and test performance of numerous network topologies swept across multiple learning tasks, depths, numbers of free parameters, learning rates, batch sizes, and regularization penalties. The dataset probes 178 thousand hyperparameter settings with an average of 20 repetitions each, totaling 3.5 million training runs and 20 performance metrics for each of the 13.1 billion training epochs observed. Accumulating this 671 GB dataset utilized 5,448 CPU core-years, 17.8 GPU-years, and 111.2 node-years. Additionally, we provide a preliminary analysis revealing patterns that persist across learning tasks and topologies. We aim to inspire work that empirically studies modern machine learning techniques as a catalyst for the theoretical discoveries needed to advance the field beyond energy-intensive and heuristic practices.