The recent emergence of deepfake videos raises an important societal question: how can we know whether a video we watch is real or fake? In three online studies with 15,016 participants, we present authentic videos and deepfakes and ask participants to identify which is which. We compare the performance of ordinary participants against the leading computer vision deepfake detection model and find them similarly accurate, though they make different kinds of mistakes. Participants with access to the model's prediction are more accurate than either alone, but inaccurate model predictions often decrease participants' accuracy. We embed randomized experiments and find that incidental anger decreases participants' performance, and that obstructing holistic visual processing of faces also hinders participants' performance while mostly leaving the model's unaffected. These results suggest that accounting for emotional influences and harnessing the specialized, holistic visual processing of ordinary people could be promising defenses against machine-manipulated media.