Image aesthetic quality assessment has been a popular research topic over the last decade. Besides numerical assessment, natural language assessment (aesthetic captioning) has been proposed to describe the general aesthetic impression of an image. In this paper, we propose aesthetic attribute assessment, i.e., aesthetic attribute captioning, which assesses individual aesthetic attributes such as composition, lighting usage, and color arrangement. Labeling comments for aesthetic attributes is a non-trivial task, which limits the scale of the corresponding datasets. We construct a novel dataset, named DPC-CaptionsV2, in a semi-automatic way: knowledge is transferred from a small-scale dataset with full annotations to large-scale professional comments from a photography website. Images in DPC-CaptionsV2 carry comments on up to 4 aesthetic attributes: composition, lighting, color, and subject. We then propose a new version of Aesthetic Multi-Attributes Networks (AMANv2) based on the BUTD model and the VLPSA model. AMANv2 fuses features from a mixture of the small-scale PCCD dataset with full annotations and the large-scale DPC-CaptionsV2 dataset with full annotations. Experimental results on DPC-CaptionsV2 show that our method can predict comments on 4 aesthetic attributes, and that these comments are closer to aesthetic topics than those produced by the previous AMAN model. Under the evaluation criteria of image captioning, the specially designed AMANv2 model outperforms the CNN-LSTM model and the AMAN model.