Investigating how people perceive virtual reality videos in the wild (\ie, those captured by everyday users) is a crucial yet challenging task in VR-related applications due to complex \textit{authentic} distortions localized in space and time. Existing panoramic video databases consider only synthetic distortions, assume fixed viewing conditions, and are limited in size. To overcome these shortcomings, we construct the VR Video Quality in the Wild (VRVQW) database, one of the first of its kind, containing $502$ user-generated videos with diverse content and distortion characteristics. Based on VRVQW, we conduct a formal psychophysical experiment to record scanpaths and perceived quality scores from $139$ participants under two different viewing conditions. We provide a thorough statistical analysis of the recorded data, observing a significant impact of viewing conditions on both human scanpaths and perceived quality. Moreover, we develop an objective quality assessment model for VR videos based on pseudocylindrical representation and convolution. Results on the proposed VRVQW show that our method outperforms existing video quality assessment models, falling behind only viewport-based models that rely on human scanpaths for projection. Lastly, we explore the additional use of VRVQW to benchmark saliency detection techniques, highlighting the need for further research. We have made the database and code available at \url{https://github.com/limuhit/VR-Video-Quality-in-the-Wild}.
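To make the notion of a pseudocylindrical representation concrete, the sketch below is a minimal illustration, not the authors' released implementation: the band count, the sinusoidal cosine law, and the nearest-neighbor resampling are all our assumptions. It splits an equirectangular (ERP) frame into latitude bands and shrinks each band's width in proportion to the cosine of its central latitude, yielding bands on which ordinary 2D convolution can then be applied.

\begin{verbatim}
# Hypothetical sketch of a pseudocylindrical representation; the actual
# model is available at the repository linked above.
import numpy as np

def pseudocylindrical_bands(erp_frame: np.ndarray, num_bands: int = 8):
    """Split an H x W x C ERP frame into latitude bands and resample
    each band's width by the cosine of its central latitude
    (a sinusoidal-style, roughly area-preserving layout)."""
    h, w, _ = erp_frame.shape
    band_h = h // num_bands
    bands = []
    for b in range(num_bands):
        rows = erp_frame[b * band_h:(b + 1) * band_h]
        # Central latitude of the band, in (-pi/2, pi/2).
        lat = np.pi * ((b + 0.5) * band_h / h - 0.5)
        new_w = max(1, int(round(w * np.cos(lat))))
        # Nearest-neighbor resampling along the horizontal axis.
        cols = (np.arange(new_w) * w / new_w).astype(int)
        bands.append(rows[:, cols])
    return bands  # one (band_h x w_b x C) array per latitude band

if __name__ == "__main__":
    frame = np.random.rand(256, 512, 3).astype(np.float32)
    for i, band in enumerate(pseudocylindrical_bands(frame)):
        print(f"band {i}: {band.shape}")
\end{verbatim}

Under this layout, bands near the poles occupy far fewer samples than at the equator, which mitigates the horizontal oversampling that plain convolution on the ERP plane would otherwise suffer from.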