The recent emergence of deepfakes has brought manipulated and generated content to the forefront of machine learning research. Automatic deepfake detection has seen many new machine learning techniques; human detection capabilities, however, are far less explored. In this paper, we present results from comparing the abilities of humans and machines to detect audio deepfakes used to imitate someone's voice. For this, we use a web-based application framework formulated as a game. Participants were asked to distinguish between real and fake audio samples. In our experiment, 472 unique users competed against a state-of-the-art AI deepfake detection algorithm for a total of 14912 rounds of the game. We find that humans and deepfake detection algorithms share similar strengths and weaknesses, both struggling to detect certain types of attacks. This contrasts with the superhuman performance of AI in many other application areas such as object detection or face recognition. Concerning human success factors, we find that IT professionals have no advantage over non-professionals, but that native speakers have an advantage over non-native speakers. Additionally, we find that older participants tend to be more susceptible than younger ones. These insights may be helpful when designing future cybersecurity training for humans as well as when developing better detection algorithms.