Semi-supervised learning (SSL) improves model generalization by leveraging massive unlabeled data to augment limited labeled samples. However, popular SSL evaluation protocols are often constrained to computer vision (CV) tasks. In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address these issues, we construct a Unified SSL Benchmark (USB) for classification by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which we systematically evaluate the dominant SSL methods, and we open-source a modular and extensible codebase for fair evaluation of these methods. We further provide pre-trained versions of state-of-the-art neural models for CV tasks to make the cost of further tuning affordable. USB enables the evaluation of a single SSL algorithm on more tasks from multiple domains at lower cost. Specifically, evaluating FixMatch on the 15 tasks in USB requires only 39 GPU days on a single NVIDIA V100, whereas 335 GPU days (279 GPU days on the 4 CV datasets excluding ImageNet) are needed for 5 CV tasks with TorchSSL.
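For readers unfamiliar with the benchmarked algorithm, the core of FixMatch's unlabeled-data objective can be sketched as follows: weakly augmented views produce hard pseudo-labels, which supervise strongly augmented views via cross-entropy, masked by a confidence threshold. This is a minimal NumPy sketch, not the USB/TorchSSL implementation; the function name and the hard-label, fixed-threshold setup are illustrative assumptions.

```python
import numpy as np

def fixmatch_unlabeled_loss(weak_probs, strong_logits, threshold=0.95):
    """Illustrative sketch of FixMatch's unlabeled-data loss.

    weak_probs:    (N, C) predicted class probabilities for weakly
                   augmented views of N unlabeled samples.
    strong_logits: (N, C) raw logits for the strongly augmented views.
    """
    pseudo = weak_probs.argmax(axis=1)          # hard pseudo-labels
    mask = weak_probs.max(axis=1) >= threshold  # keep only confident samples
    # numerically stable log-softmax over the strong-view logits
    z = strong_logits - strong_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    # per-example cross-entropy against the pseudo-labels
    ce = -log_probs[np.arange(len(pseudo)), pseudo]
    # average over confident samples only (0 if none pass the threshold)
    return float((ce * mask).sum() / max(mask.sum(), 1))
```

Only samples whose weak-view confidence clears the threshold contribute to the loss, which is what keeps early, noisy pseudo-labels from dominating training.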