We propose an approach to instance segmentation of 3D point clouds based on dynamic convolution, which enables the model to adapt, at inference time, to varying feature and object scales. This avoids some pitfalls of bottom-up approaches, including a dependence on hyper-parameter tuning and on heuristic post-processing pipelines that compensate for the inevitable variability in object sizes, even within a single scene. The representation capability of the network is greatly improved by gathering homogeneous points, that is, points that share the same semantic category and cast close votes for the geometric centroid. Instances are then decoded by several simple convolution layers whose parameters are generated conditioned on the input. The approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance. A light-weight transformer, built on the bottleneck layer, allows the model to capture long-range dependencies with limited computational overhead. The result is a simple, efficient, and robust approach that yields strong performance on ScanNetV2, S3DIS, and PartNet. The consistent improvements on both voxel- and point-based architectures indicate the effectiveness of the proposed method. Code is available at: https://git.io/DyCo3D
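The idea of decoding instances with convolution layers whose parameters are generated from the input can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that). The function name `dynamic_conv_masks`, the single linear "weight generator" (`gen_w`, `gen_b`), the hidden width, and the use of a mean-pooled cluster feature are all assumptions made for illustration, and the weights here are random rather than learned.

```python
import numpy as np

def dynamic_conv_masks(feats, coords, cluster_ids, gen_w, gen_b, hidden=8):
    """Decode one binary mask per cluster using filters generated from that cluster.

    feats:       (N, C) per-point features
    coords:      (N, 3) point coordinates
    cluster_ids: (N,) integer cluster label per point, -1 for unassigned points
    gen_w/gen_b: parameters of a hypothetical linear weight generator
    """
    n, c = feats.shape
    in_dim = c + 3  # point features plus position relative to the cluster centroid
    n_params = in_dim * hidden + 2 * hidden + 1
    assert gen_w.shape == (c, n_params) and gen_b.shape == (n_params,)

    masks = {}
    for k in np.unique(cluster_ids[cluster_ids >= 0]):
        member = cluster_ids == k
        # Summarize the cluster, then generate the filter parameters from it.
        theta = feats[member].mean(axis=0) @ gen_w + gen_b
        w1 = theta[:in_dim * hidden].reshape(in_dim, hidden)
        rest = theta[in_dim * hidden:]
        b1, w2, b2 = rest[:hidden], rest[hidden:2 * hidden], rest[2 * hidden]
        # Position-aware input: every point sees its offset from this cluster's centroid.
        rel = coords - coords[member].mean(axis=0)
        x = np.concatenate([feats, rel], axis=1)
        # Two 1x1 convolutions over points are per-point linear maps.
        h = np.maximum(x @ w1 + b1, 0.0)
        logits = h @ w2 + b2
        masks[int(k)] = logits > 0.0  # same as sigmoid(logits) > 0.5
    return masks

# Toy usage with random data and an untrained generator (assumed sizes).
rng = np.random.default_rng(0)
N, C, H = 50, 4, 8
feats = rng.normal(size=(N, C))
coords = rng.normal(size=(N, 3))
cluster_ids = rng.integers(-1, 3, size=N)
n_params = (C + 3) * H + 2 * H + 1
gen_w = rng.normal(size=(C, n_params)) * 0.1
gen_b = np.zeros(n_params)
masks = dynamic_conv_masks(feats, coords, cluster_ids, gen_w, gen_b, hidden=H)
```

Because the filters are a function of each cluster's own features, no fixed set of decoding weights has to fit every object size, which is the property the abstract attributes to dynamic convolution.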