Prior work suggests that the memorability of an image is consistent across people and can therefore be treated as an intrinsic property of the image. Computer vision models let us make specific predictions about what people will remember or forget. Earlier work predicted image memorability with deep learning architectures that are now outdated, and advances in the field have since provided new techniques to apply to this problem. Here, we propose and evaluate five alternative deep learning models that exploit developments from the last five years, chiefly the introduction of residual neural networks, which are intended to allow the model to use semantic information in the memorability estimation process. We tested these new models against the prior state of the art on a combined dataset built to optimize both within-category and across-category predictions. Our findings suggest that the generalizability of the key prior memorability network had been overstated and that the network was overfit to its training set. Our new models outperform this prior model, leading us to conclude that residual networks outperform simpler convolutional neural networks in memorability regression. We make our new state-of-the-art model readily available to the research community, allowing memory researchers to make predictions about memorability on a wider range of images.