Automatically localizing a target object in an image is crucial for many computer vision applications. Recently, ellipse representations have been identified as an alternative to axis-aligned bounding boxes for object localization. This paper considers 3D-aware ellipse labels for 2D target localization, i.e., ellipses obtained by projecting a 3D ellipsoidal approximation of the object into the image. Such generic ellipsoidal models make it possible to handle coarsely known targets, and 3D-aware ellipse detections carry more geometric information about the object than traditional 3D-agnostic bounding box labels. We propose to take a new look at ellipse regression and replace the geometric ellipse parameters with the parameters of an implicit Gaussian distribution encoding object occupancy in the image. The models are trained to regress the values of this bivariate Gaussian distribution over the image pixels using a continuous statistical loss function. We introduce a novel non-trainable differentiable layer, E-DSNT, to extract the distribution parameters. We also describe how to readily generate consistent 3D-aware Gaussian occupancy parameters using only coarse dimensions of the target and relative pose labels. To validate our hypothesis, we extend three existing spacecraft pose estimation datasets with 3D-aware Gaussian occupancy labels.
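As an illustration of the idea of reading Gaussian parameters off a pixel-wise occupancy map, the sketch below renders a bivariate Gaussian over an image grid and recovers its mean and covariance from the first and second spatial moments. This is only a minimal NumPy sketch under stated assumptions: the abstract does not specify how E-DSNT is implemented, so the moment-based extraction, the function names `gaussian_heatmap` and `moments_from_heatmap`, and the example values are all hypothetical.

```python
import numpy as np


def gaussian_heatmap(mean, cov, height, width):
    """Render a normalized bivariate Gaussian over a pixel grid (hypothetical helper)."""
    ys, xs = np.mgrid[0:height, 0:width]
    # Pixel coordinates in (x, y) order, centered on the mean.
    pts = np.stack([xs, ys], axis=-1).astype(float) - np.asarray(mean, dtype=float)
    inv_cov = np.linalg.inv(cov)
    # Quadratic form (p - mu)^T Sigma^{-1} (p - mu) at every pixel.
    quad = np.einsum("hwi,ij,hwj->hw", pts, inv_cov, pts)
    heat = np.exp(-0.5 * quad)
    return heat / heat.sum()


def moments_from_heatmap(heat):
    """Recover mean and covariance as spatial moments of the heatmap.

    Assumption: a moment-based, parameter-free extraction; not necessarily
    the E-DSNT layer named in the abstract.
    """
    height, width = heat.shape
    prob = heat / heat.sum()
    ys, xs = np.mgrid[0:height, 0:width]
    coords = np.stack([xs, ys], axis=-1).astype(float)
    # First moment: expected pixel position.
    mean = np.einsum("hw,hwi->i", prob, coords)
    # Second central moment: 2x2 covariance of the pixel position.
    centered = coords - mean
    cov = np.einsum("hw,hwi,hwj->ij", prob, centered, centered)
    return mean, cov


# Example values chosen for illustration only.
true_mean = np.array([30.0, 34.0])
true_cov = np.array([[16.0, 4.0], [4.0, 9.0]])
heat = gaussian_heatmap(true_mean, true_cov, height=64, width=64)
est_mean, est_cov = moments_from_heatmap(heat)
```

Both steps use only sums and products, so the same computation stays differentiable when written in an automatic differentiation framework. The eigenvectors and eigenvalues of the recovered covariance give the orientation and squared axis scales of the corresponding ellipse.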