Over the last decade, convolutional neural networks (CNNs) have outperformed their predecessors in nearly all computer vision tasks, including object recognition. Despite these advances, they still fall well short of biological vision. This gap has prompted the development of biologically inspired models that attempt to mimic the human visual system, primarily at the neural level, and that are evaluated on standard dataset benchmarks. However, more work is needed to understand how these models perceive the visual world. This article proposes a state-of-the-art procedure that generates a new metric, Psychophysical-Score, which is grounded in visual psychophysics and can reliably estimate perceptual responses across numerous models spanning a wide range of complexity and biological inspiration. We apply the procedure to twelve such models and compare the results against the aggregated results of 2,390 Amazon Mechanical Turk workers, who together provided ~2.7 million perceptual responses. Each model's Psychophysical-Score is then compared against the state-of-the-art neural activity-based metric, Brain-Score. Our study indicates that models with a high correlation to human perceptual behavior also have a high correlation with the corresponding neural activity.