Image set-based visual classification methods have achieved strong performance by characterising each image set as a non-singular covariance matrix on a symmetric positive definite (SPD) manifold. To adapt better to complicated visual scenarios, several Riemannian networks (RiemNets) for nonlinear processing of SPD matrices have recently been studied. It is pertinent to ask, however, whether greater accuracy gains can be achieved simply by increasing the depth of RiemNets. The answer appears to be negative, as deeper RiemNets tend to lose generalization ability. To explore a possible solution to this issue, we propose a new architecture for SPD matrix learning. Specifically, to enrich the deep representations, we adopt SPDNet [1] as the backbone and append a stacked Riemannian autoencoder (SRAE) at its tail. The associated reconstruction error term drives the embedding functions of the SRAE as a whole, and of each individual RAE, toward an approximate identity mapping, which helps to prevent the degradation of statistical information. We then insert several residual-like blocks with shortcut connections to augment the representational capacity of the SRAE and to simplify the training of a deeper network. Experiments show that the resulting network, DreamNet, achieves improved accuracy as the depth of the network increases.
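As a rough illustration of the representation described above, the following NumPy sketch builds a non-singular covariance matrix from an image set and passes it through one bilinear mapping and one eigenvalue rectification, the two layer types commonly associated with SPDNet. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the ridge term `eps`, the rectification `threshold`, and the random feature data are all hypothetical, and the SRAE and shortcut connections are not reproduced here.

```python
import numpy as np

def set_covariance(features, eps=1e-3):
    """Represent an image set (n_images x d features) as a d x d SPD matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (features.shape[0] - 1)
    # Assumed regulariser: a small ridge keeps the matrix non-singular.
    return cov + eps * np.trace(cov) * np.eye(cov.shape[0])

def bimap(spd, weight):
    """Bilinear mapping W^T X W, with W (d_in x d_out) having orthonormal columns."""
    return weight.T @ spd @ weight

def reeig(spd, threshold=1e-4):
    """Eigenvalue rectification: clamp eigenvalues from below at `threshold`."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    # Scale each eigenvector column by its rectified eigenvalue, then recompose.
    return (eigvecs * np.maximum(eigvals, threshold)) @ eigvecs.T

# Hypothetical data: 30 images, each described by an 8-dimensional feature vector.
rng = np.random.default_rng(0)
features = rng.standard_normal((30, 8))
spd_in = set_covariance(features)

# A random 8 x 4 matrix with orthonormal columns stands in for a learned weight.
weight, _ = np.linalg.qr(rng.standard_normal((8, 4)))
spd_out = reeig(bimap(spd_in, weight))
```

Because the weight has full column rank, the 4 x 4 output stays symmetric positive definite, which is the property that lets such layers be stacked.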