The recent emergence of deepfakes has brought manipulated and generated content to the forefront of machine learning research. Automatic detection of deepfakes has spurred many new machine learning techniques; however, human detection capabilities are far less explored. In this paper, we present results from comparing the abilities of humans and machines to detect audio deepfakes used to imitate someone's voice. For this, we use a web-based application framework formulated as a game. Participants were asked to distinguish between real and fake audio samples. In our experiment, 378 unique users competed against a state-of-the-art AI deepfake detection algorithm over 12,540 total rounds of the game. We find that humans and deepfake detection algorithms share similar strengths and weaknesses, both struggling to detect certain types of attacks. This stands in contrast to the superhuman performance of AI in many application areas such as object detection or face recognition. Regarding human success factors, we find that IT professionals have no advantage over non-professionals, but native speakers have an advantage over non-native speakers. Additionally, we find that older participants tend to be more susceptible than younger ones. These insights may be helpful when designing future cybersecurity training for humans as well as when developing better detection algorithms.