The aesthetic quality of an image is defined as the measure or appreciation of its beauty. Aesthetics is inherently subjective, but certain factors influence it, such as the semantic content of the image, the attributes describing its artistic aspect, and the photographic setup used for the shot. In this paper we propose a method for the automatic prediction of image aesthetics based on the analysis of the semantic content, the artistic style, and the composition of the image. The proposed network includes: a pre-trained network for semantic feature extraction (the Backbone); a Multi-Layer Perceptron (MLP) network that relies on the Backbone features to predict image attributes (the AttributeNet); and a self-adaptive Hypernetwork that exploits the attribute prior encoded in the embedding generated by the AttributeNet to predict the parameters of the target network dedicated to aesthetic estimation (the AestheticNet). Given an image, the proposed multi-network predicts style and composition attributes as well as the aesthetic score distribution. Results on three benchmark datasets demonstrate the effectiveness of the proposed method, while the ablation study gives a better understanding of the proposed network.
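The data flow described above can be illustrated with a minimal, untrained sketch in plain Python. It is not the authors' implementation: all layer sizes, the single hidden layer per network, the ReLU and softmax choices, and the decision to feed the Backbone features into the AestheticNet are assumptions made for illustration only, and the Backbone itself is replaced by a plain feature vector.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only to keep the example small.
FEAT_DIM, EMBED_DIM, NUM_ATTRS, HIDDEN, NUM_BINS = 12, 6, 4, 5, 10


def init_linear(in_dim, out_dim):
    """Dense layer as (weights[in][out], bias[out]) with small random weights."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(out_dim)]
               for _ in range(in_dim)]
    return weights, [0.0] * out_dim


def linear(x, layer):
    """Apply a dense layer to the vector x."""
    weights, bias = layer
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))]


def relu(x):
    return [max(v, 0.0) for v in x]


def softmax(x):
    peak = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


# AttributeNet (assumed shape): features -> embedding -> attribute scores.
attr_hidden = init_linear(FEAT_DIM, EMBED_DIM)
attr_head = init_linear(EMBED_DIM, NUM_ATTRS)

# Hypernetwork: maps the embedding to every parameter of the AestheticNet.
N_PARAMS = FEAT_DIM * HIDDEN + HIDDEN + HIDDEN * NUM_BINS + NUM_BINS
hyper = init_linear(EMBED_DIM, N_PARAMS)


def take_layer(flat, offset, in_dim, out_dim):
    """Slice one dense layer out of the flat parameter vector."""
    weights = [flat[offset + i * out_dim: offset + (i + 1) * out_dim]
               for i in range(in_dim)]
    offset += in_dim * out_dim
    bias = flat[offset: offset + out_dim]
    return (weights, bias), offset + out_dim


def forward(features):
    """features: list of FEAT_DIM floats standing in for the Backbone output."""
    embedding = relu(linear(features, attr_hidden))
    attributes = linear(embedding, attr_head)

    # The AestheticNet weights are generated per image from the embedding.
    flat = linear(embedding, hyper)
    layer1, offset = take_layer(flat, 0, FEAT_DIM, HIDDEN)
    layer2, offset = take_layer(flat, offset, HIDDEN, NUM_BINS)

    hidden = relu(linear(features, layer1))
    score_dist = softmax(linear(hidden, layer2))
    return attributes, score_dist
```

Calling `forward([1.0] * FEAT_DIM)` returns a list of `NUM_ATTRS` attribute scores and a `NUM_BINS`-bin distribution that sums to one. The point of the sketch is the design choice named in the abstract: the AestheticNet has no fixed weights of its own, so its behaviour adapts to each image through the attribute embedding.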