Learning from unlabeled or partially labeled data to reduce the human labeling burden remains a challenging research topic in 3D modeling. Along this line, unsupervised representation learning is a promising direction for extracting features automatically, without human intervention. This paper proposes a general unsupervised approach, named \textbf{ConClu}, that learns point-wise and global features by jointly leveraging point-level clustering and instance-level contrasting. First, we design an Expectation-Maximization (EM)-like soft clustering algorithm based on optimal transport, which provides local supervision for extracting discriminative local features. We show that this criterion extends standard cross-entropy minimization to an optimal transport problem, which we solve efficiently using a fast variant of the Sinkhorn-Knopp algorithm. Second, we provide an instance-level contrasting method that learns the global geometry by maximizing the similarity between two augmentations of the same point cloud. Experimental evaluations on downstream applications such as 3D object classification and semantic segmentation demonstrate the effectiveness of our framework and show that it can outperform state-of-the-art techniques.
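The two training signals named above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes that clusters receive equal total mass (a common choice for Sinkhorn-Knopp-based clustering that the abstract does not confirm), that point-to-prototype scores are cosine similarities, and that the instance-level term is a plain negative cosine similarity. The function names `sinkhorn_knopp`, `soft_cross_entropy`, and `negative_cosine`, and the values of `epsilon` and `temperature`, are hypothetical.

```python
import numpy as np


def sinkhorn_knopp(scores, epsilon=0.05, n_iters=50):
    """Map (n_points, n_clusters) scores to soft assignments whose rows sum to 1.

    Assumption (not from the source): every cluster should receive the same
    total mass, i.e. the cluster marginal is uniform.
    """
    n_points, n_clusters = scores.shape
    # Subtract the global maximum before exponentiating for numerical stability.
    q = np.exp((scores - scores.max()) / epsilon)
    q /= q.sum()
    for _ in range(n_iters):
        # Column step: rescale so each cluster holds mass 1 / n_clusters.
        q /= q.sum(axis=0, keepdims=True)
        q /= n_clusters
        # Row step: rescale so each point holds mass 1 / n_points.
        q /= q.sum(axis=1, keepdims=True)
        q /= n_points
    # Rescale so that every row is a probability distribution over clusters.
    return q * n_points


def soft_cross_entropy(targets, logits, temperature=0.1):
    """Mean cross-entropy between soft targets and softmax(logits / temperature)."""
    x = logits / temperature
    x = x - x.max(axis=1, keepdims=True)
    log_p = x - np.log(np.exp(x).sum(axis=1, keepdims=True))
    return float(-(targets * log_p).sum(axis=1).mean())


def negative_cosine(z1, z2):
    """Instance-level term: minimizing it maximizes the cosine similarity."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    return float(-(z1 * z2).sum(axis=1).mean())


# Toy usage: random stand-ins for the encoder outputs of two augmented views.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(8, 16))
prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)
points_a = rng.normal(size=(128, 16))
points_a /= np.linalg.norm(points_a, axis=1, keepdims=True)
points_b = points_a + 0.1 * rng.normal(size=(128, 16))
points_b /= np.linalg.norm(points_b, axis=1, keepdims=True)

# E-like step: balanced soft assignments computed from view A.
assignments = sinkhorn_knopp(points_a @ prototypes.T)
# M-like step objective: view B is trained to predict the assignments of view A.
local_loss = soft_cross_entropy(assignments, points_b @ prototypes.T)
# Global term: max-pooled point features serve as a hypothetical global descriptor.
global_loss = negative_cosine(points_a.max(axis=0, keepdims=True),
                              points_b.max(axis=0, keepdims=True))
```

In this sketch the alternating row and column rescaling is what turns per-point cross-entropy targets into the solution of an entropy-regularized transport problem: without the column step, every point could collapse onto the same cluster.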