The widespread popularity of digital photography and social networks has generated a rapidly growing volume of multimedia data (i.e., images, music, and videos), creating great demand for managing, retrieving, and understanding these data. Affective computing (AC) of these data can help to understand human behaviors and enable a wide range of applications. In this article, we comprehensively survey state-of-the-art AC technologies for large-scale heterogeneous multimedia data. We begin by introducing the typical emotion representation models from psychology that are widely employed in AC. We then briefly describe the datasets available for evaluating AC algorithms. Next, we summarize and compare representative AC methods for different multimedia types, i.e., images, music, videos, and multimodal data, with a focus on both handcrafted feature-based methods and deep learning methods. Finally, we discuss some challenges and future directions for multimedia affective computing.