Obtaining labeled data for machine learning tasks can be prohibitively expensive. Active learning mitigates this issue by exploring the unlabeled data space and prioritizing the selection of data that can best improve model performance. A common approach to active learning is to pick a small sample of data for which the model is most uncertain. In this paper, we explore the efficacy of Bayesian neural networks for active learning, which naturally model uncertainty by learning a distribution over the weights of the network. Through a comprehensive set of experiments, we show that Bayesian neural networks are more efficient than ensemble-based techniques at capturing uncertainty. Our findings also reveal some key drawbacks of ensemble techniques, which were recently shown to be more effective than Monte Carlo dropout.
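To make the uncertainty-sampling loop described above concrete, the following is a minimal sketch (not taken from the paper; the function names and random inputs are hypothetical) of how one might score unlabeled points by predictive entropy over posterior samples from a Bayesian neural network and query the most uncertain ones:

```python
# Minimal sketch of uncertainty-based acquisition: given T posterior
# predictive samples (e.g., from stochastic forward passes of a Bayesian
# neural network), score each unlabeled point by predictive entropy and
# query the top-k most uncertain points. Names and data are illustrative.
import numpy as np

def predictive_entropy(probs):
    """probs: shape (T, N, C) -- T posterior samples, N unlabeled points,
    C classes. Returns predictive entropy per point, shape (N,)."""
    mean_probs = probs.mean(axis=0)  # posterior predictive, shape (N, C)
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)

def select_queries(probs, k):
    """Return indices of the k most uncertain unlabeled points."""
    scores = predictive_entropy(probs)
    return np.argsort(scores)[-k:][::-1]  # highest entropy first

# Hypothetical usage: 20 posterior samples, 1000 unlabeled points, 10 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(20, 1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
print(select_queries(probs, k=5))
```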