Research into multi-modal perception, human cognition, behavior, and attention can benefit from high-fidelity content that recreates lifelike scenes when rendered on head-mounted displays. Moreover, measures of audiovisual perception, cognitive processes, and behavior may complement questionnaire-based Quality of Experience (QoE) evaluation of interactive virtual environments. Currently, there is a lack of high-quality open-source audiovisual databases that can be used to evaluate such aspects or the systems capable of reproducing high-quality content. In this paper, we provide a publicly available audiovisual database consisting of twelve scenes capturing real-life natural and urban environments, with a video resolution of 7680x3840 at 60 frames per second and with 4th-order Ambisonics audio. These 360-degree video sequences, with an average duration of 60 seconds, represent real-life settings for systematically evaluating various dimensions of uni-/multi-modal perception, cognition, behavior, and QoE. The paper details the scene requirements, the recording approach, and the scene descriptions. The database provides high-quality reference material with a balanced focus on auditory and visual sensory information. It will be continuously updated with additional scenes and further metadata such as human ratings and saliency information.