Voxel-based 3D object classification has been studied frequently in recent years. Previous methods often directly convert classic 2D convolution into a 3D form and apply it to objects with a binary voxel representation. In this paper, we investigate why the binary voxel representation is not well suited to 3D convolution and how to improve accuracy and speed simultaneously. We show that, with a two-layer fully connected network, giving each voxel a signed distance value improves accuracy by about 30% compared with the binary voxel representation. We then propose a fast hybrid cascade network that combines fully connected and convolutional stages for voxel-based 3D object classification. This three-stage cascade network divides 3D models into three categories: easy, moderate and hard. Consequently, the mean inference time (0.3 ms) is about 5x and 2x faster than state-of-the-art point cloud and voxel-based methods, respectively, while achieving the highest accuracy among voxel-based methods (92%). Experiments on ModelNet and MNIST verify the performance of the proposed hybrid cascade network.
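To make the signed distance representation concrete, the following is a minimal sketch of turning a binary occupancy grid into a signed distance grid. It is not taken from the paper: the function name `signed_distance_grid`, the sign convention (negative inside, positive outside), and the use of Euclidean distance between voxel centers are all assumptions made for illustration, and the brute-force search is only practical for small grids.

```python
import numpy as np


def signed_distance_grid(occupancy):
    """Hypothetical helper: signed distance per voxel from a binary grid.

    Assumed convention (not from the paper): occupied voxels get the
    negative distance to the nearest empty voxel, empty voxels get the
    positive distance to the nearest occupied voxel. Distances are
    Euclidean, measured between voxel centers in voxel units.
    """
    occ = np.asarray(occupancy, dtype=bool)
    if not occ.any() or occ.all():
        raise ValueError("grid must contain both occupied and empty voxels")

    # Index coordinates of every voxel, in the same C order as occ.ravel().
    coords = np.argwhere(np.ones(occ.shape, dtype=bool))
    flat = occ.ravel()
    inside = coords[flat]
    outside = coords[~flat]

    def nearest(src, dst):
        # Brute-force nearest-neighbour distance from each src voxel to dst.
        diff = src[:, None, :] - dst[None, :, :]
        return np.linalg.norm(diff, axis=-1).min(axis=1)

    sdf = np.empty(occ.shape, dtype=float)
    sdf[occ] = -nearest(inside, outside)
    sdf[~occ] = nearest(outside, inside)
    return sdf


# Usage: a 3x3x3 grid with only the center voxel occupied.
grid = np.zeros((3, 3, 3), dtype=bool)
grid[1, 1, 1] = True
sdf = signed_distance_grid(grid)
```

Unlike the binary grid, every voxel in `sdf` carries information about how far it is from the object surface, which is one plausible reading of why the representation helps a small fully connected network. For realistic resolutions, a dedicated distance transform would replace the brute-force search.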