360-degree streaming videos can provide rich, immersive experiences to users. However, streaming them requires extremely high network bandwidth. A common solution for reducing bandwidth consumption is to stream only the portion of the video covered by the user's viewport. To do that, viewport prediction is indispensable. Existing viewport prediction methods mainly rely on the user's head movement trajectory and video saliency. None of them consider the navigation information contained in the video, which is highly likely to draw the user's attention to specific regions of the video. Such information is often included in video subtitles, especially those of 360-degree virtual tourism videos. This fact reveals the potential contribution of video subtitles to viewport prediction. Therefore, in this paper, a subtitle-based viewport prediction model for 360-degree virtual tourism videos is proposed. The model leverages the navigation information in video subtitles, in addition to head movement trajectory and video saliency, to improve prediction accuracy. Experimental results demonstrate that the proposed model outperforms baseline methods that use only head movement trajectory and video saliency for viewport prediction.
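To make the fusion idea concrete, the following is a minimal illustrative sketch, not the paper's actual model: it combines a head-movement prior, a video saliency map, and a subtitle-derived navigation hint into a single tile-level viewport probability map, from which the most likely tiles are selected for streaming. The tile grid size, fusion weights, and all function names are hypothetical assumptions for illustration only.

```python
# Minimal sketch (assumptions only): fuse three per-tile cue maps into one
# viewport probability map, then pick the top-k tiles for tile-based streaming.
import numpy as np

TILES_Y, TILES_X = 6, 12  # equirectangular frame split into 6x12 tiles (assumed)

def fuse_cues(head_prior, saliency, subtitle_hint,
              w_head=0.5, w_sal=0.3, w_sub=0.2):
    """Weighted fusion of three per-tile cue maps into a probability map."""
    def normalize(m):
        m = np.clip(m, 0.0, None)
        s = m.sum()
        return m / s if s > 0 else np.full_like(m, 1.0 / m.size)

    fused = (w_head * normalize(head_prior)
             + w_sal * normalize(saliency)
             + w_sub * normalize(subtitle_hint))
    return normalize(fused)

def predict_viewport(fused, k=8):
    """Return indices of the k most likely tiles (row, col)."""
    flat = np.argsort(fused.ravel())[::-1][:k]
    return np.stack(np.unravel_index(flat, fused.shape), axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    head_prior = rng.random((TILES_Y, TILES_X))    # from past head trajectory
    saliency = rng.random((TILES_Y, TILES_X))      # from a saliency detector
    subtitle_hint = np.zeros((TILES_Y, TILES_X))
    subtitle_hint[2, 9] = 1.0                      # e.g. subtitle says "on your right"
    fused = fuse_cues(head_prior, saliency, subtitle_hint)
    print(predict_viewport(fused, k=4))
```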