Image classification is a fundamental application in computer vision. Recently, deeper and more densely connected networks have shown state-of-the-art performance on image classification tasks. Most current datasets consist of a finite number of color images. These images are taken as input in RGB form, and classification is done without modifying them. We explore the importance of color spaces and show that color spaces (essentially transformations of the original RGB images) can significantly affect classification accuracy. Further, we show that certain classes of images are better represented in particular color spaces, and that for datasets with a wide variety of classes, such as CIFAR and ImageNet, a single model that considers multiple color spaces gives excellent accuracy. We also show that such a model, where the input is preprocessed into multiple color spaces simultaneously, needs far fewer parameters to obtain high classification accuracy. For example, our model with 1.75M parameters significantly outperforms DenseNet 100-12, which has 12M parameters, and gives results comparable to DenseNet-BC-190-40, which has 25.6M parameters, on four competitive image classification datasets: CIFAR-10, CIFAR-100, SVHN and ImageNet. Our model takes an RGB image as input, simultaneously converts the image into 7 different color spaces, and uses these as inputs to individual DenseNets. We use small and wide DenseNets to reduce the computational overhead and the number of hyperparameters required. We also obtain significant improvements over current state-of-the-art results on these datasets.
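The preprocessing step described above, in which one RGB image is converted into several color spaces that each feed a separate network, can be illustrated with a minimal sketch. The abstract does not name the 7 color spaces, so the selection below (RGB, HSV, HLS, YIQ, chosen because Python's standard `colorsys` module supports them) is an assumption, as are the names `CONVERTERS` and `to_color_spaces`. The sketch covers only the conversion, not the DenseNets or how their outputs are combined.

```python
import colorsys

# Hypothetical selection of color spaces: the abstract does not list the
# seven it uses. These are the ones available in the standard library.
CONVERTERS = {
    "RGB": lambda r, g, b: (r, g, b),  # identity, keeps the original image
    "HSV": colorsys.rgb_to_hsv,
    "HLS": colorsys.rgb_to_hls,
    "YIQ": colorsys.rgb_to_yiq,
}


def to_color_spaces(image):
    """Return one converted copy of the image per color space.

    `image` is a list of rows, each row a list of (r, g, b) tuples with
    float components in [0, 1]. The result maps each color space name to
    an image of the same shape.
    """
    return {
        name: [[convert(*pixel) for pixel in row] for row in image]
        for name, convert in CONVERTERS.items()
    }


# A 2x2 test image: red, green, blue, white.
image = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
]
views = to_color_spaces(image)
```

In a multi-branch setup like the one the abstract describes, each entry of `views` would be passed to its own network. A practical implementation would more likely operate on whole arrays with an image library rather than looping over pixels.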