Computational inference of aesthetics is an ill-defined task due to its subjective nature. Many datasets have been proposed to tackle the problem by providing pairs of images and aesthetic scores based on human ratings. However, humans are better at expressing their opinions, tastes, and emotions through language than at summarizing them in a single number. In fact, photo critiques provide much richer information, as they reveal how and why users rate the aesthetics of visual stimuli. In this regard, we propose the Reddit Photo Critique Dataset (RPCD), which contains tuples of images and photo critiques. RPCD consists of 74K images and 220K comments and is collected from a Reddit community used by hobbyists and professional photographers to improve their photography skills by leveraging constructive community feedback. The proposed dataset differs from previous aesthetics datasets in three main aspects: (i) its large scale and the extent of its comments, which critique different aspects of the image, (ii) its mostly UltraHD images, and (iii) its easy extension to new data, since it is collected through an automatic pipeline. To the best of our knowledge, this work is the first attempt to estimate the aesthetic quality of visual stimuli from critiques. To this end, we exploit the sentiment polarity of the critiques as an indicator of aesthetic judgment. We demonstrate that sentiment polarity correlates positively with the aesthetic judgments available in two aesthetic assessment benchmarks. Finally, we experiment with several models by using the sentiment scores as a target for ranking images. Dataset and baselines are available (https://github.com/mediatechnologycenter/aestheval).