Developing deep networks that analyze fashion garments has many real-world applications. Among all fashion attributes, color is one of the most important yet most challenging to detect. Existing approaches are classification-based and therefore cannot go beyond a predefined list of discrete color names. In this paper, we treat color detection as a regression problem and predict exact RGB values. Our newly proposed architecture therefore adds a second regression stage that refines the output of a first color classifier. This second stage combines two attention models: the first depends on the type of clothing, and the second depends on the color previously detected by the classifier. Our final prediction is a weighted spatial pooling over the RGB values of the image pixels, computed after illumination correction. The architecture is modular and can easily be extended to detect the RGB values of every color in a multicolor garment. In our experiments, we show the benefits of each component of the architecture.
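The final pooling step can be illustrated with a minimal NumPy sketch. It is not the implementation from the paper: the function names `gray_world` and `weighted_rgb_pooling` are hypothetical, gray-world white balance is an assumed stand-in for the unspecified illumination correction, and combining the two attention maps by elementwise product is likewise an assumption. In the described architecture the attention maps would come from learned models; here they are plain arrays.

```python
import numpy as np


def gray_world(image):
    """Gray-world white balance (assumed stand-in for illumination correction).

    image: float array of shape (H, W, 3) with values in [0, 255].
    """
    image = image.astype(np.float64)
    channel_means = image.reshape(-1, 3).mean(axis=0)
    # Scale each channel so that all three channel means become equal.
    gain = channel_means.mean() / np.maximum(channel_means, 1e-8)
    return np.clip(image * gain, 0.0, 255.0)


def weighted_rgb_pooling(image, garment_attention, color_attention):
    """Pool pixel RGB values with weights from two attention maps.

    image: array of shape (H, W, 3).
    garment_attention, color_attention: non-negative arrays of shape (H, W).
    Returns a length-3 array holding the predicted RGB value.
    """
    # Assumption: the two maps are fused by elementwise product.
    weights = garment_attention * color_attention
    total = weights.sum()
    if total <= 0:
        raise ValueError("attention weights must have a positive sum")
    weights = weights / total  # normalize so the weights sum to 1
    # Broadcast the (H, W) weights over the channel axis and sum spatially.
    return (image * weights[..., None]).sum(axis=(0, 1))
```

Under the same assumptions, a multicolor garment could be handled by calling `weighted_rgb_pooling` once per detected color, each time with the color attention map for that color.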