Active learning (AL) is a promising ML paradigm with the potential to sift through large unlabeled datasets and reduce annotation cost in domains where labeling data is prohibitively expensive. Recently proposed neural-network-based AL methods use different heuristics to accomplish this goal. In this study, we demonstrate that under identical experimental settings, different types of AL algorithms (uncertainty based, diversity based, and committee based) produce inconsistent gains over the random sampling baseline. Through a variety of experiments controlling for sources of stochasticity, we show that the variance in performance metrics achieved by AL algorithms can lead to results that do not agree with previously reported findings. We also find that under strong regularization, AL methods show marginal or no advantage over the random sampling baseline across a variety of experimental conditions. Finally, we conclude with a set of recommendations on how to assess results obtained with a new AL algorithm to ensure they are reproducible and robust under changes in experimental conditions. We believe our findings and recommendations will help advance reproducible AL research using neural networks. To facilitate AL evaluations, we open source our code at https://github.com/PrateekMunjal/TorchAL
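To make the comparison above concrete, here is a minimal sketch (not the paper's implementation) of an uncertainty-based acquisition step contrasted with the random sampling baseline. The entropy scoring, pool size, and fixed seed are illustrative assumptions; real AL loops would retrain the model between acquisition rounds.

```python
import numpy as np

def entropy_scores(probs):
    # Predictive entropy per unlabeled example; higher = more uncertain.
    eps = 1e-12  # guard against log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_uncertain(probs, k):
    # Uncertainty-based acquisition: pick the k highest-entropy examples.
    return np.argsort(entropy_scores(probs))[-k:]

def select_random(n_pool, k, seed=0):
    # Random sampling baseline; the fixed seed is one way to control
    # a source of stochasticity when comparing AL methods.
    rng = np.random.default_rng(seed)
    return rng.choice(n_pool, size=k, replace=False)

# Toy unlabeled pool: softmax outputs for 4 examples, 3 classes.
probs = np.array([
    [0.98, 0.01, 0.01],   # confident
    [0.40, 0.35, 0.25],   # uncertain
    [0.34, 0.33, 0.33],   # most uncertain (near uniform)
    [0.60, 0.30, 0.10],   # moderately confident
])
uncertain_idx = select_uncertain(probs, k=2)
random_idx = select_random(len(probs), k=2)
```

Whether `uncertain_idx` yields labeled batches that beat `random_idx` after retraining is exactly the kind of claim the paper argues must be tested under controlled seeds and regularization settings.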