Recently, Zhang et al. (2018) proposed an interesting model of attention guidance that uses visual features learned by convolutional neural networks (CNNs) trained for object recognition. I adapted this model for search experiments with accuracy as the measure of performance. Simulations of our previously published feature and conjunction search experiments revealed that the CNN-based search model considerably underestimates human attention guidance by simple visual features. A simple explanation is that the model has no bottom-up guidance of attention. Another view is that standard CNNs do not learn the features required for human-like attention guidance.
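The guidance mechanism described above can be illustrated with a minimal sketch. This is not Zhang et al.'s (2018) actual implementation; it only assumes the general idea of comparing a target's CNN feature vector against feature vectors at every location of the search image, with random arrays standing in for real CNN activations.

```python
# Illustrative sketch of feature-based attention guidance (assumed mechanism,
# not Zhang et al.'s exact model). Random arrays stand in for CNN activations.
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 64, 8, 8                      # channels, spatial height, width
scene = rng.standard_normal((C, H, W))  # stand-in for CNN scene features
target = scene[:, 3, 5]                 # pretend the target sits at (3, 5)

# Priority map: cosine similarity of the target vector with every location.
flat = scene.reshape(C, -1)
sims = (target @ flat) / (np.linalg.norm(target) * np.linalg.norm(flat, axis=0))
priority = sims.reshape(H, W)

# A search model of this kind would fixate locations in order of
# decreasing priority; here the true target location wins by construction.
best = np.unravel_index(np.argmax(priority), priority.shape)
```

In a search experiment with accuracy as the performance measure, one would then score whether the highest-priority location (or one of the first few fixated locations) contains the target.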