Active learning promises to alleviate the massive data needs of supervised machine learning: it has successfully improved sample efficiency by an order of magnitude on traditional tasks like topic classification and object recognition. However, we uncover a striking contrast to this promise: across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection. To understand this discrepancy, we profile 8 active learning methods on a per-example basis, and identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn (e.g., questions that ask about text in images or require external knowledge). Through systematic ablation experiments and qualitative visualizations, we verify that collective outliers are a general phenomenon responsible for degrading pool-based active learning. Notably, we show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases. We conclude with a discussion and prescriptive recommendations for mitigating the effects of these outliers in future work.
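To make the comparison in the abstract concrete, here is a minimal sketch of a pool-based acquisition step, contrasting an uncertainty (entropy) strategy with the random-selection baseline. All names, the toy pool, and the entropy scorer are illustrative assumptions, not the paper's actual models or implementation.

```python
import math
import random

def entropy(probs):
    # Shannon entropy of a predictive distribution (higher = more uncertain).
    return -sum(p * math.log(p) for p in probs if p > 0)

def acquire(pool, predict, k, strategy="entropy", rng=None):
    """Select k examples from the unlabeled pool.

    pool:     list of examples
    predict:  function mapping an example to class probabilities
    strategy: 'entropy' picks the most uncertain examples;
              'random' is the baseline the abstract reports is
              surprisingly hard to beat on visual question answering.
    """
    rng = rng or random.Random(0)
    if strategy == "random":
        return rng.sample(pool, k)
    # Rank pool examples by predictive uncertainty, highest first.
    scored = sorted(pool, key=lambda ex: entropy(predict(ex)), reverse=True)
    return scored[:k]

# Hypothetical pool: each example carries a fixed predictive distribution.
pool = [("easy", [0.95, 0.05]), ("medium", [0.7, 0.3]), ("hard", [0.5, 0.5])]
predict = lambda ex: ex[1]

picked = acquire(pool, predict, k=1, strategy="entropy")
print(picked[0][0])  # -> "hard": the most uncertain example is acquired first
```

Under the abstract's findings, a strategy like the entropy scorer above tends to acquire collective outliers (hard, unlearnable examples such as the "hard" item here), which is precisely why it can fail to beat `strategy="random"` on VQA pools.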