The rise of machine-learning systems that process sensory input has brought with it a rise in comparisons between human and machine perception. But such comparisons face a challenge: Whereas machine perception of some stimulus can often be probed through direct and explicit measures, much of human perceptual knowledge is latent, incomplete, or unavailable for explicit report. Here, we explore how this asymmetry can cause such comparisons to misestimate the overlap in human and machine perception. As a case study, we consider human perception of \textit{adversarial speech} -- synthetic audio commands that are recognized as valid messages by automated speech-recognition systems but that human listeners reportedly hear as meaningless noise. In five experiments, we adapt task designs from the human psychophysics literature to show that even when subjects cannot freely transcribe such speech commands (the previous benchmark for human understanding), they often can demonstrate other forms of understanding, including discriminating adversarial speech from closely matched non-speech (Experiments 1--2), finishing common phrases begun in adversarial speech (Experiments 3--4), and solving simple math problems posed in adversarial speech (Experiment 5) -- even for stimuli previously described as unintelligible to human listeners. We recommend the adoption of such ``sensitive tests'' when comparing human and machine perception, and we discuss the broader consequences of such approaches for assessing the overlap between systems.