How Shift Equivariance Impacts Metric Learning for Instance Segmentation

Metric learning has received conflicting assessments concerning its suitability for solving instance segmentation tasks. It has been dismissed as theoretically flawed due to the shift equivariance of the employed CNNs and their resulting inability to distinguish same-looking objects. Yet it has been shown to yield state-of-the-art results for a variety of tasks, and practical issues have mainly been reported in the context of tile-and-stitch approaches, where discontinuities at tile boundaries have been observed. To date, neither of the reported issues has undergone thorough formal analysis. In our work, we contribute a comprehensive formal analysis of the shift equivariance properties of encoder-decoder-style CNNs, which yields a clear picture of what can and cannot be achieved with metric learning in the face of same-looking objects. In particular, we prove that a standard encoder-decoder network that takes $d$-dimensional images as input, with $l$ pooling layers and pooling factor $f$, has the capacity to distinguish at most $f^{dl}$ same-looking objects, and we show that this upper limit can be reached. Furthermore, we show that to avoid discontinuities in a tile-and-stitch approach, assuming standard batch size 1, it is necessary to employ valid convolutions in combination with a training output window size strictly greater than $f^l$, while at test time it is necessary to crop tiles to size $n\cdot f^l$ before stitching, with $n\geq 1$. We complement these theoretical findings by discussing a number of insightful special cases for which we show empirical results on synthetic data.
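
To make the quantities in the abstract concrete, the following minimal sketch evaluates them for given $d$, $l$, and $f$. The function names are hypothetical and do not come from the paper. The sketch also assumes that sizes are integers measured in pixels along one axis, so the smallest window "strictly greater than $f^l$" is taken to be $f^l + 1$.

```python
from typing import List


def max_distinguishable_objects(d: int, l: int, f: int) -> int:
    """Upper bound f^(d*l) on same-looking objects the network can tell apart."""
    return f ** (d * l)


def min_training_output_window(l: int, f: int) -> int:
    """Smallest integer size strictly greater than f^l (assumes integer pixel sizes)."""
    return f ** l + 1


def stitchable_crop_sizes(l: int, f: int, max_n: int) -> List[int]:
    """Test-time crop sizes n * f^l for n = 1 .. max_n."""
    return [n * f ** l for n in range(1, max_n + 1)]


if __name__ == "__main__":
    # Example: 2D images, 3 pooling layers, pooling factor 2.
    print(max_distinguishable_objects(d=2, l=3, f=2))
    print(min_training_output_window(l=3, f=2))
    print(stitchable_crop_sizes(l=3, f=2, max_n=3))
```

For a 2D network with three pooling layers of factor 2, the bound is $2^{2\cdot 3} = 64$ objects, and valid test-time crops are multiples of $2^3 = 8$.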

Related content

Metric learning

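The goal of metric learning is to measure how similar samples are to one another, which is also one of the core problems of pattern recognition. The performance of many machine learning methods depends mainly on the choice of similarity measure between samples: classifiers such as k-nearest neighbors, support vector machines, and radial basis function networks, as well as K-means clustering and some graph-based methods. Metric learning typically aims to make the distance between samples of the same class as small as possible and the distance between samples of different classes as large as possible.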
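
The objective described above can be illustrated with a contrastive-style pairwise loss. This is one common formulation, chosen here as an assumption for illustration; it is not taken from the page or from the paper. The function names and the `margin` parameter are hypothetical.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_pair_loss(
    a: Sequence[float],
    b: Sequence[float],
    same_class: bool,
    margin: float = 1.0,
) -> float:
    """Pull same-class pairs together; push different-class pairs apart.

    Same-class pairs are penalized by their squared distance.
    Different-class pairs are penalized only while closer than `margin`.
    """
    dist = euclidean(a, b)
    if same_class:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


if __name__ == "__main__":
    print(contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], same_class=True))
    print(contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], same_class=False))
```

Minimizing this loss over many pairs shrinks same-class distances and grows different-class distances up to the margin, which matches the stated goal.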