The visual quality of point clouds has drawn growing attention as the expanding range of 3D vision applications is expected to deliver cost-effective, high-quality experiences to users. Existing point cloud quality assessment (PCQA) methods usually evaluate visual quality from single-modal information, extracted either from 2D projections or from the 3D point cloud itself. 2D projections contain rich texture and semantic information but depend heavily on the viewpoint, whereas 3D point clouds are more sensitive to geometry distortions and are invariant to viewpoint. To leverage the advantages of both the point cloud and projected image modalities, we propose a novel multi-modal no-reference point cloud quality assessment (NR-PCQA) metric. Specifically, we split the point cloud into sub-models to represent local geometry distortions such as point shift and down-sampling, and we render the point cloud into 2D image projections for texture feature extraction. The sub-models and projected images are then encoded with point-based and image-based neural networks, respectively. Finally, symmetric cross-modal attention is employed to fuse the multi-modal quality-aware information. Experimental results show that our approach outperforms all compared state-of-the-art methods and surpasses previous NR-PCQA methods by a large margin, which highlights the effectiveness of the proposed method.
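As an illustration of the fusion step, the following is a minimal sketch of symmetric cross-modal attention, not the implementation evaluated in this work. It assumes that both encoders emit feature vectors of the same dimension, uses single-head scaled dot-product attention without learned query/key/value projections, and concatenates mean-pooled features at the end. The function names (`cross_attention`, `symmetric_fusion`) and the pooling and concatenation choices are hypothetical and serve only to show the symmetry: point features attend to image features and image features attend to point features.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Let every query vector attend over the context vectors.

    Assumption: learned Q/K/V projections are omitted, so the context
    vectors act as both keys and values.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        )
    return attended


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def symmetric_fusion(point_feats, image_feats):
    """Fuse both modalities with attention applied in both directions.

    Assumption: the fused descriptor is the concatenation of the pooled
    original features and the pooled cross-attended features.
    """
    point_to_image = cross_attention(point_feats, image_feats)
    image_to_point = cross_attention(image_feats, point_feats)
    return (
        mean_pool(point_feats)
        + mean_pool(image_feats)
        + mean_pool(point_to_image)
        + mean_pool(image_to_point)
    )


# Toy usage: 3 sub-model features and 2 projection features, each of dimension 4.
point_feats = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
image_feats = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
fused = symmetric_fusion(point_feats, image_feats)
print(len(fused))  # 16 = 4 pooled vectors of dimension 4
```

In a trained model the fused vector would typically be passed to a regression head that predicts the quality score; that head and all learned weights are left out of this sketch.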