Convolutional neural networks (CNNs) have dominated computer vision for the last ten years, exhibiting extremely powerful feature extraction capabilities and outstanding classification performance. The main strategy for prolonging this trend is to scale networks up further. However, costs increase rapidly while performance improvements may be marginal. We hypothesise that adding heterogeneous sources of information to a CNN may be more cost-effective than building a bigger network. In this paper, we propose an ensemble method for accurate image classification that fuses features detected automatically by CNN architectures with a set of manually defined statistical indicators. By combining the predictions of a CNN with those of a secondary classifier trained on statistical features, better classification performance can be achieved cheaply. We test multiple learning algorithms and CNN architectures on a diverse set of datasets to validate our proposal, and make all our code and data public via GitHub. According to our results, the additional indicators and the ensemble classification approach improve performance on 8 of 9 datasets, with a remarkable increase of more than 10% in precision on two of them.
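The fusion idea can be illustrated with a minimal sketch, under stated assumptions. The abstract does not say which statistical indicators are used or how the two sets of predictions are combined, so the indicator set (mean, standard deviation, skewness, minimum, maximum), the weighted-average fusion rule, and the names `statistical_indicators`, `fuse_predictions`, `predict_class` and `cnn_weight` are all hypothetical choices made for illustration, not the authors' implementation.

```python
from statistics import fmean, pstdev


def statistical_indicators(pixels):
    """Return hand-crafted descriptors for a flat list of pixel intensities.

    Assumption: this indicator set is illustrative only; the abstract does
    not list the indicators actually used.
    """
    mu = fmean(pixels)
    sigma = pstdev(pixels)
    if sigma == 0:
        skew = 0.0  # a constant image has no asymmetry
    else:
        skew = fmean([((p - mu) / sigma) ** 3 for p in pixels])
    return [mu, sigma, skew, min(pixels), max(pixels)]


def fuse_predictions(cnn_probs, secondary_probs, cnn_weight=0.5):
    """Combine two class-probability vectors by weighted averaging.

    Assumption: weighted averaging is one common late-fusion rule, chosen
    here for simplicity; cnn_weight is a hypothetical hyperparameter.
    """
    if len(cnn_probs) != len(secondary_probs):
        raise ValueError("both classifiers must score the same classes")
    if not 0.0 <= cnn_weight <= 1.0:
        raise ValueError("cnn_weight must lie in [0, 1]")
    fused = [
        cnn_weight * c + (1.0 - cnn_weight) * s
        for c, s in zip(cnn_probs, secondary_probs)
    ]
    total = sum(fused)
    return [f / total for f in fused]  # renormalise so the scores sum to 1


def predict_class(cnn_probs, secondary_probs, cnn_weight=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_predictions(cnn_probs, secondary_probs, cnn_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up probabilities: the CNN alone would pick class 0, but the secondary
# classifier (trained on statistical indicators) shifts the decision.
cnn_probs = [0.40, 0.35, 0.25]
secondary_probs = [0.10, 0.70, 0.20]
chosen = predict_class(cnn_probs, secondary_probs)
```

In this sketch the secondary classifier would be trained on the vectors returned by `statistical_indicators`, and `cnn_weight` would be tuned on a validation set. Both models are kept separate and only their outputs are merged, which is what keeps the added cost small relative to enlarging the CNN.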