Central to active learning (AL) is the question of which data should be selected for annotation. Existing works attempt to select highly uncertain or informative data for annotation. Nevertheless, it remains unclear how the selected data impact the test performance of the task model used in AL. In this work, we explore this impact by theoretically proving that selecting unlabeled data of higher gradient norm leads to a lower upper bound on the test loss, resulting in better test performance. However, due to the lack of label information, directly computing the gradient norm for unlabeled data is infeasible. To address this challenge, we propose two schemes, namely expected-gradnorm and entropy-gradnorm. The former computes the gradient norm by constructing an expected empirical loss, while the latter constructs an unsupervised loss with entropy. Furthermore, we integrate the two schemes into a universal AL framework. We evaluate our method on classical image classification and semantic segmentation tasks. To demonstrate its competence in domain applications and its robustness to noise, we also validate our method on a cellular imaging analysis task, namely cryo-Electron Tomography subtomogram classification. Results demonstrate that our method achieves superior performance against the state of the art. Our source code is available at https://github.com/xulabs/aitom/blob/master/doc/projects/al_gradnorm.md.
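The two scoring schemes can be sketched for a plain softmax classifier (a minimal illustration, not the paper's full implementation; the closed-form gradients below are derived for softmax cross-entropy and softmax entropy with respect to the logits, and the batch-selection framework is omitted):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def expected_gradnorm(logits):
    # Expected-gradnorm sketch: average the gradient norm of the
    # cross-entropy loss over all candidate labels, weighted by the
    # model's predicted probabilities (no ground-truth label needed).
    # For softmax cross-entropy, d/dz CE(z, y) = p - onehot(y).
    p = softmax(logits)
    score = 0.0
    for y, p_y in enumerate(p):
        g = p.copy()
        g[y] -= 1.0  # gradient w.r.t. logits assuming label y
        score += p_y * np.linalg.norm(g)
    return score

def entropy_gradnorm(logits):
    # Entropy-gradnorm sketch: gradient norm of the unsupervised
    # entropy loss H(p) w.r.t. the logits.
    # Closed form: dH/dz_k = -p_k * (log p_k + H).
    p = softmax(logits)
    H = -np.sum(p * np.log(p + 1e-12))
    g = -p * (np.log(p + 1e-12) + H)
    return np.linalg.norm(g)
```

Under this sketch, a confidently classified sample (one dominant logit) yields a near-zero expected-gradnorm score, while an uncertain sample yields a larger score and would be prioritized for annotation.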