This work presents Trans10K-v2, a new fine-grained transparent object segmentation dataset that extends Trans10K-v1, the first large-scale transparent object segmentation dataset. Unlike Trans10K-v1, which has only two categories, the new dataset offers two main benefits. (1) It has 11 fine-grained categories of transparent objects that commonly occur in domestic environments, making it more practical for real-world applications. (2) Trans10K-v2 is more challenging for current advanced segmentation methods than its predecessor. Furthermore, we propose a novel transformer-based segmentation pipeline termed Trans2Seg. First, the transformer encoder of Trans2Seg provides a global receptive field, in contrast to the local receptive field of CNNs, which gives it a clear advantage over pure CNN architectures. Second, by formulating semantic segmentation as a dictionary look-up problem, we design a set of learnable prototypes as the queries of Trans2Seg's transformer decoder, where each prototype learns the statistics of one category across the whole dataset. We benchmark more than 20 recent semantic segmentation methods and show that Trans2Seg significantly outperforms all the CNN-based methods, demonstrating the proposed algorithm's potential for transparent object segmentation.
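The dictionary look-up view of segmentation can be illustrated with a minimal sketch. It is not the Trans2Seg implementation: it assumes a single attention head, no learned projections, fixed rather than learned prototypes, and a plain argmax in place of any prediction head. The names `prototype_lookup` and `segment` are hypothetical. Each class prototype acts as a query, each flattened pixel feature acts as a key, and the resulting attention map over pixels is read as that class's mask score.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_lookup(prototypes, pixel_features):
    """Score every pixel feature against every class prototype.

    prototypes:      K vectors of length d, one query per category
                     (assumed fixed here; learnable in the abstract).
    pixel_features:  N vectors of length d, the flattened feature map
                     used as keys.
    Returns K attention maps, each a length-N list that sums to 1.
    """
    d = len(prototypes[0])
    scale = 1.0 / math.sqrt(d)  # scaled dot-product attention
    return [
        softmax([scale * dot(q, k) for k in pixel_features])
        for q in prototypes
    ]


def segment(prototypes, pixel_features):
    """Assign each pixel the class whose attention map scores it highest.

    This argmax is an illustrative simplification, not the paper's head.
    """
    attn = prototype_lookup(prototypes, pixel_features)
    n_pixels = len(pixel_features)
    return [
        max(range(len(attn)), key=lambda c: attn[c][p])
        for p in range(n_pixels)
    ]
```

For example, with two prototypes `[[1.0, 0.0], [0.0, 1.0]]` and three pixel features `[[4.0, 0.0], [0.0, 4.0], [3.0, 1.0]]`, `segment` assigns the first and third pixels to class 0 and the second to class 1.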