Field of view (FoV) prediction is critical in 360-degree video multicast, a key component of emerging Virtual Reality (VR) and Augmented Reality (AR) applications. Most current prediction methods that combine saliency detection with FoV information have two shortcomings that degrade prediction performance: they do not account for the fact that the distortion of projected 360-degree videos can invalidate the weight sharing of traditional convolutional networks, and they do not adequately consider the difficulty of obtaining complete multi-user FoV information. This paper proposes a spherical convolution-empowered FoV prediction method, a multi-source prediction framework that combines salient features extracted from 360-degree video with limited FoV feedback information. A spherical convolutional neural network (CNN) is used instead of a traditional two-dimensional CNN to avoid the weight-sharing failure caused by video projection distortion. Specifically, salient spatial-temporal features are extracted by a spherical convolution-based saliency detection model, and the limited FoV feedback is modeled as a time series by a spherical convolution-empowered gated recurrent unit (GRU) network. Finally, the extracted salient video features are combined with this FoV representation to predict future user FoVs. Experimental results show that the proposed method outperforms other prediction methods.
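
The projection distortion mentioned above can be made concrete with a small sketch. It is not part of the proposed method; it only illustrates, under the assumption of an equirectangular projection (the abstract does not name the projection), why a fixed planar kernel covers very different regions of the sphere at different rows. The function names `row_latitudes`, `horizontal_stretch`, and `pixel_solid_angle` are hypothetical and chosen for illustration.

```python
import math


def row_latitudes(height):
    """Latitude (radians) at the centre of each pixel row, top row first.

    Assumes an equirectangular frame whose rows span +pi/2 (top) to -pi/2 (bottom).
    """
    return [math.pi / 2 - (r + 0.5) * math.pi / height for r in range(height)]


def horizontal_stretch(height):
    """Horizontal stretch of each row relative to the equator: 1 / cos(latitude)."""
    return [1.0 / math.cos(lat) for lat in row_latitudes(height)]


def pixel_solid_angle(height, width):
    """Solid angle (steradians) covered by a single pixel in each row.

    Obtained by integrating cos(latitude) over the row's latitude band
    and multiplying by the longitude width of one pixel.
    """
    d_lat = math.pi / height
    d_lon = 2 * math.pi / width
    return [
        d_lon * (math.sin(lat + d_lat / 2) - math.sin(lat - d_lat / 2))
        for lat in row_latitudes(height)
    ]


if __name__ == "__main__":
    height, width = 180, 360
    stretch = horizontal_stretch(height)
    print("stretch near the pole:   ", stretch[0])
    print("stretch near the equator:", stretch[height // 2])
    total = sum(a * width for a in pixel_solid_angle(height, width))
    print("total solid angle:", total, "(a full sphere is 4*pi)")
```

Because the same 3x3 window corresponds to a much smaller patch of the sphere near the poles than at the equator, weights learned at one latitude do not transfer cleanly to another, which is the motivation the abstract gives for replacing the planar CNN with a spherical one.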