The recent emergence of machine-manipulated media raises an important societal question: how can we know whether a video we watch is real or fake? In two online studies with 15,016 participants, we present authentic videos and deepfakes and ask participants to identify which is which. We compare the performance of ordinary human observers against the leading computer vision deepfake detection model and find them similarly accurate, though they make different kinds of mistakes. Participants with access to the model's prediction are more accurate than either humans or the model alone, but inaccurate model predictions often decrease participants' accuracy. To probe the relative strengths and weaknesses of humans and machines as detectors of deepfakes, we examine human and machine performance across video-level features, and we evaluate the impact of pre-registered randomized interventions on deepfake detection. We find that manipulations designed to disrupt visual processing of faces hinder human participants' performance while mostly not affecting the model's performance, suggesting a role for specialized cognitive capacities in explaining human deepfake detection performance.
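To make the combination claim concrete, the following is a minimal simulation sketch, not the study's actual data or pipeline: all numbers, variable names, and the averaging rule are illustrative assumptions. It shows that when two judges are individually about 80% accurate but their errors are independent (they "make different kinds of mistakes"), averaging their confidence scores yields a more accurate combined judgment.

```python
# Minimal sketch (simulated, hypothetical data -- not the study's method):
# two independent noisy judges of whether each video is a deepfake,
# compared against the average of their confidence scores.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
truth = rng.integers(0, 2, n)  # 1 = deepfake, 0 = authentic

# Per-video confidence that the video is fake, centered on the truth,
# with independent noise for the human and the model (assumed std = 0.6).
human = np.clip(truth + rng.normal(0, 0.6, n), 0, 1)
model = np.clip(truth + rng.normal(0, 0.6, n), 0, 1)

# Accuracy of thresholding a confidence score at 0.5.
acc = lambda score: np.mean((score > 0.5) == truth)

print(f"human alone: {acc(human):.3f}")
print(f"model alone: {acc(model):.3f}")
print(f"averaged:    {acc((human + model) / 2):.3f}")
```

Under these assumptions, each judge alone is roughly 80% accurate while the averaged score reaches roughly 88%; the gain comes entirely from the independence of the two error distributions. The same setup also suggests the abstract's caveat: if the model's score is wrong and the human defers to it, the combined judgment can fall below the human's unassisted accuracy.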