Boundaries are among the primary visual cues used by human and computer vision systems. One of the key problems in boundary detection is the label representation, which typically leads to class imbalance and, as a consequence, to thick boundaries that require non-differentiable post-processing steps to be thinned. In this paper, we re-interpret boundaries as 1-D surfaces and formulate a one-to-one vector transform function that allows training of boundary prediction while completely avoiding the class imbalance issue. Specifically, we define the boundary representation at any point as the unit vector pointing to the closest boundary surface. Our problem formulation leads to the estimation of direction as well as richer contextual information about the boundary and, if desired, the availability of zero-pixel-thin boundaries also at training time. Our method uses no hyper-parameter in the training loss and a fixed, stable hyper-parameter at inference. We provide a theoretical justification and discussion of the vector transform representation. We evaluate the proposed loss using a standard architecture and show excellent performance over other losses and representations on several datasets. Code is available at https://github.com/edomel/BoundaryVT.
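As a concrete illustration of the representation described above, the following is a minimal sketch of how such a vector-transform label could be derived from a binary boundary mask. It is an illustrative assumption, not the authors' implementation: the function name `vector_transform` is hypothetical, and it uses SciPy's Euclidean distance transform (with `return_indices=True`) to locate each pixel's nearest boundary point before normalizing the displacement to a unit vector.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def vector_transform(boundary_mask):
    """For every pixel of a binary boundary mask (H, W), compute the
    unit vector pointing to the closest boundary pixel."""
    # EDT of the complement: boundary pixels become the zeros that the
    # transform measures distance to; return_indices yields, per pixel,
    # the coordinates of its nearest boundary pixel.
    _, nearest = distance_transform_edt(
        ~boundary_mask.astype(bool), return_indices=True
    )
    coords = np.indices(boundary_mask.shape)          # (2, H, W) pixel grid
    vectors = (nearest - coords).astype(np.float64)   # displacement to boundary
    norms = np.linalg.norm(vectors, axis=0)
    norms[norms == 0] = 1.0                           # pixels on the boundary keep a zero vector
    return vectors / norms                            # unit vectors, shape (2, H, W)

# Usage sketch: a horizontal boundary through row 2 of a 5x5 image.
mask = np.zeros((5, 5), dtype=bool)
mask[2, :] = True
v = vector_transform(mask)
# v[:, 0, 0] -> [1., 0.]: the top-left pixel points straight down
# (in row/column coordinates) toward the boundary row.
```

Every non-boundary pixel receives a well-defined unit-vector label under this construction, which is how the abstract's claim of avoiding class imbalance can be read: there is no sparse positive class, only a dense regression target.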