Deep neural networks have strong representational power, but typically require large numbers of training examples. This motivates deep active learning methods that can significantly reduce the amount of labeled training data. Empirical successes of deep active learning have recently been reported in the literature; however, rigorous label complexity guarantees for deep active learning have remained elusive, constituting a significant gap between theory and practice. This paper tackles this gap by providing the first near-optimal label complexity guarantees for deep active learning. The key insight is to study deep active learning from the nonparametric classification perspective. Under standard low noise conditions, we show that active learning with neural networks can provably achieve the minimax label complexity, up to the disagreement coefficient and other logarithmic terms. When equipped with an abstention option, we further develop an efficient deep active learning algorithm that achieves $\mathsf{polylog}(\frac{1}{\epsilon})$ label complexity without any low noise assumptions. We also extend our results beyond the commonly studied Sobolev/H\"older spaces and develop label complexity guarantees for learning in Radon $\mathsf{BV}^2$ spaces, which have recently been proposed as natural function spaces associated with neural networks.
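For context, the "standard low noise conditions" referenced above typically refer to the Tsybakov margin condition; a minimal statement (with $\eta(x) = \Pr(Y = 1 \mid X = x)$ denoting the regression function, notation assumed here rather than taken from this abstract) is
$$\Pr\bigl(0 < \bigl|\eta(X) - \tfrac{1}{2}\bigr| \le t\bigr) \le c\, t^{\beta} \quad \text{for all } t > 0,$$
for constants $c > 0$ and $\beta \ge 0$. Larger $\beta$ means less probability mass near the decision boundary $\{x : \eta(x) = \tfrac{1}{2}\}$, which is what permits smaller label complexity for active learners.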