When groups of people are tasked with making a judgment, uncertainty often arises. Existing methods for reducing uncertainty typically focus on iteratively making the overall task instructions more specific. However, uncertainty can arise from multiple sources, such as ambiguity of the item being judged due to limited context, or disagreement among the participants due to different perspectives and an under-specified task. A one-size-fits-all intervention may be ineffective if it does not target the right source of uncertainty. In this paper we introduce a new workflow, Judgment Sieve, to reduce uncertainty in group judgment tasks in a targeted manner. By using measurements that separate the different sources of uncertainty during an initial round of judgment elicitation, we can select a targeted intervention for each item being judged, either adding context or prompting deliberation, to reduce its uncertainty most effectively. We test our approach on two tasks, rating word pair similarity and rating the toxicity of online comments, and show that targeted interventions reduced uncertainty for the most uncertain cases. In the top 10% of cases, we saw an ambiguity reduction of 21.4% and 25.7%, and a disagreement reduction of 22.2% and 11.2%, for the two tasks respectively. We also found through a simulation that our targeted approach reduced the average uncertainty scores for both sources of uncertainty, whereas with uniform approaches a reduction in average uncertainty from one source came with an increase for the other.
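The routing step described above, measuring two sources of uncertainty per item and then choosing an intervention, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: that each participant reports a (low, high) rating range, that ambiguity is the mean range width, that disagreement is the spread of range midpoints, and the function names and thresholds themselves. The actual measurements used by Judgment Sieve may differ.

```python
from statistics import mean, pstdev

# Hypothetical input: one (low, high) rating range per participant for an item.

def ambiguity(ranges):
    """Assumed proxy for ambiguity: mean width of the individual ranges."""
    return mean([high - low for low, high in ranges])

def disagreement(ranges):
    """Assumed proxy for disagreement: spread of the range midpoints."""
    return pstdev([(low + high) / 2 for low, high in ranges])

def choose_intervention(ranges, amb_threshold=1.0, dis_threshold=1.0):
    """Pick an intervention for one item (thresholds are illustrative)."""
    amb = ambiguity(ranges)
    dis = disagreement(ranges)
    if amb < amb_threshold and dis < dis_threshold:
        return "none"  # item is already judged with low uncertainty
    # Compare each score relative to its own threshold and target the
    # dominant source: context for ambiguity, deliberation for disagreement.
    if amb / amb_threshold >= dis / dis_threshold:
        return "add_context"
    return "deliberation"

# Wide individual ranges, identical midpoints: ambiguity dominates.
print(choose_intervention([(1, 5), (1, 5), (2, 4)]))
# Narrow ranges, far-apart midpoints: disagreement dominates.
print(choose_intervention([(1, 1.5), (4.5, 5), (1, 1.5)]))
```

Comparing each score against its own threshold keeps the two measures on a common footing even though they are computed differently; a real deployment would need to calibrate those thresholds on pilot data.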