Measures of face-identification proficiency are essential to ensure accurate and consistent performance by professional forensic face examiners and others who perform face-identification tasks in applied scenarios. Current proficiency tests rely on static sets of stimulus items and so cannot be validly administered to the same individual more than once. To create a proficiency test, a large number of items of "known" difficulty must be assembled. Multiple tests of equal difficulty can then be constructed from subsets of these items. We introduce the Triad Identity Matching (TIM) test and evaluate it using Item Response Theory (IRT). Participants view face-image "triads" (N=225), each consisting of two images of one identity and one image of a different identity, and select the image of the different identity. In Experiment 1, university students (N=197) showed wide-ranging accuracy on the TIM test, and IRT modeling demonstrated that the TIM items span a range of difficulty levels. In Experiment 2, we used IRT-based item metrics to partition the test into subsets of specific difficulties. Simulations showed that subsets of the TIM items yielded reliable estimates of subject ability. In Experiments 3a and 3b, we found that the student-derived IRT model reliably evaluated the ability of non-student participants and that ability estimates generalized across test sessions. In Experiment 3c, we showed that TIM test performance correlates with performance on other common face-recognition tests. In summary, the TIM test provides a starting point for developing a flexible, calibrated framework for measuring proficiency across a range of ability levels (e.g., professionals or populations with face-processing deficits).
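As a rough illustration of how IRT item metrics could support equal-difficulty test forms, the following minimal sketch uses a two-parameter logistic (2PL) item response function and a simple serpentine assignment of difficulty-sorted items. Both choices are assumptions made for illustration: the abstract does not state which IRT model or partitioning procedure was used, the function names are hypothetical, and the sketch ignores guessing in the three-alternative triad task.

```python
import math

def p_correct(theta, b, a=1.0):
    """2PL item response function (an assumed model, not necessarily the one
    used for the TIM test): probability that a subject with ability theta
    answers an item with difficulty b and discrimination a correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def split_into_forms(difficulties, n_forms):
    """Assign items to n_forms test forms so the forms have similar mean
    difficulty. Items are sorted by difficulty and dealt out in serpentine
    order (0, 1, ..., k-1, k-1, ..., 1, 0, ...). Returns lists of item indices."""
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])
    forms = [[] for _ in range(n_forms)]
    for rank, item in enumerate(order):
        block, pos = divmod(rank, n_forms)
        # Reverse direction on every other block to balance the forms.
        form = pos if block % 2 == 0 else n_forms - 1 - pos
        forms[form].append(item)
    return forms

def mean_difficulty(form, difficulties):
    """Average difficulty of the items in one form."""
    return sum(difficulties[i] for i in form) / len(form)

# Hypothetical item difficulties on the IRT ability scale (not TIM data).
example_difficulties = [-1.5, -0.5, 0.5, 1.5]
example_forms = split_into_forms(example_difficulties, 2)
```

With these four hypothetical items, the two forms each pair one easier and one harder item and so have the same mean difficulty. Real item pools would rarely balance exactly, and a practical procedure would likely also match discrimination or test information.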