Cognitive psychologists have documented that humans use cognitive heuristics, or mental shortcuts, to make quick decisions while expending less effort. We hypothesize that annotators performing annotation work on crowdsourcing platforms rely on such heuristics as well, and that this heuristic use cascades into data quality and model robustness. In this work, we study cognitive heuristic use in the context of annotating multiple-choice reading comprehension datasets. We propose tracking annotator heuristic traces: tangible measurements of low-effort annotation strategies that could indicate the use of various cognitive heuristics. Based on correlations with a battery of psychological tests, we find evidence that annotators may be using multiple such heuristics. Importantly, heuristic use among annotators determines data quality along several dimensions: (1) known biased models, such as partial-input models, more easily solve examples authored by annotators who rate highly on heuristic use; (2) models trained on data from annotators who score highly on heuristic use generalize less well; and (3) heuristic-seeking annotators tend to create qualitatively less challenging examples. Our findings suggest that tracking heuristic usage among annotators can potentially help with collecting challenging datasets and diagnosing model biases.