Deep learning accelerators efficiently train over vast and growing amounts of data, placing a newfound burden on commodity networks and storage devices. A common approach to conserve bandwidth involves resizing or compressing data prior to training. We introduce Progressive Compressed Records (PCRs), a data format that uses compression to reduce the overhead of fetching and transporting data, effectively reducing the training time required to achieve a target accuracy. PCRs deviate from previous storage formats by combining progressive compression with an efficient storage layout to view a single dataset at multiple fidelities---all without adding to the total dataset size. We implement PCRs and evaluate them on a range of datasets, training tasks, and hardware architectures. Our work shows that: (i) the amount of compression a dataset can tolerate exceeds 50% of the original encoding for many DL training tasks; (ii) it is possible to automatically and efficiently select appropriate compression levels for a given task; and (iii) PCRs enable tasks to readily access compressed data at runtime---utilizing as little as half the training bandwidth and thus potentially doubling training speed.
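To make the underlying mechanism concrete, the sketch below shows the kind of progressive compression PCRs build on: a progressive JPEG is a sequence of scans of increasing fidelity, so decoding only a prefix of the byte stream yields a lower-fidelity version of the same image. This is a minimal illustration, not the paper's implementation; the Pillow dependency, the function names, and the arbitrary truncation fraction are assumptions for illustration. The abstract's storage layout goes further, arranging these progressive levels so that an entire dataset can be read at a chosen fidelity without storing it more than once.

```python
from io import BytesIO

from PIL import Image, ImageFile

# Allow Pillow to decode byte streams whose trailing scans are missing;
# a real format would track scan boundaries instead of truncating blindly.
ImageFile.LOAD_TRUNCATED_IMAGES = True


def encode_progressive(image_path: str) -> bytes:
    """Encode an image as a progressive JPEG (a sequence of refinement scans)."""
    buf = BytesIO()
    Image.open(image_path).save(buf, format="JPEG", progressive=True, quality=90)
    return buf.getvalue()


def decode_prefix(jpeg_bytes: bytes, fraction: float) -> Image.Image:
    """Decode only the first `fraction` of the stream, yielding a lower-fidelity image."""
    prefix = jpeg_bytes[: int(len(jpeg_bytes) * fraction)]
    img = Image.open(BytesIO(prefix))
    img.load()  # decodes whatever scans are present in the prefix
    return img


# Example: reading half the bytes still produces a usable, lower-fidelity image,
# which is the bandwidth/fidelity trade-off the abstract describes.
# full = encode_progressive("example.jpg")
# low_fidelity = decode_prefix(full, fraction=0.5)
```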