The recently introduced locally orderless tensor network (LoTeNet) for supervised image classification applies matrix product state (MPS) operations to grids of transformed image patches. The resulting patch representations are recombined in image space and aggregated hierarchically, using multiple MPS blocks per layer, to obtain the final decision rules. In this work, we propose a non-patch-based modification of LoTeNet that performs a single MPS operation per layer instead of several patch-level operations. As in LoTeNet, the spatial information in the input to the MPS block at each layer is squeezed into the feature dimension, so that as much spatial correlation between pixels as possible is retained when images are flattened into 1D vectors. The proposed multi-layered tensor network (MLTN) can learn linear decision boundaries in high-dimensional spaces in a multi-layered setting, which reduces the computational cost compared to LoTeNet without any degradation in performance.
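The squeeze step and the single MPS operation per layer described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `squeeze` and `mps_forward`, the tensor shapes, the all-ones boundary vectors, and the placement of the output index in the middle of the chain are assumptions chosen for illustration, and the local feature map applied to pixel values before contraction is omitted.

```python
import numpy as np

def squeeze(image, k):
    """Move each k x k spatial block into the feature dimension.

    image: array of shape (H, W, C); returns shape (H//k, W//k, k*k*C).
    (Hypothetical helper; the stride k is an assumed hyperparameter.)
    """
    h, w, c = image.shape
    assert h % k == 0 and w % k == 0, "H and W must be divisible by k"
    x = image.reshape(h // k, k, w // k, k, c)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two in-block axes together
    return x.reshape(h // k, w // k, k * k * c)

def mps_forward(sites, cores, out_core):
    """Contract N site vectors with an MPS carrying one output index.

    sites:    (N, d)        one feature vector per flattened position
    cores:    (N, D, d, D)  one order-3 tensor per site, bond dimension D
    out_core: (D, n_out, D) tensor holding the output index (assumed mid-chain)
    Returns a vector of length n_out. The map is linear in each site vector.
    """
    # Contract every core with its site vector: one (D, D) matrix per site.
    mats = np.einsum('nldr,nd->nlr', cores, sites)
    bond = mats.shape[1]
    mid = len(mats) // 2
    left = np.ones(bond)            # assumed boundary vector
    for m in mats[:mid]:
        left = left @ m
    right = np.ones(bond)           # assumed boundary vector
    for m in mats[mid:][::-1]:
        right = m @ right
    return np.einsum('l,lor,r->o', left, out_core, right)

# Usage: one layer acting on a whole 8 x 8 single-channel image at once.
rng = np.random.default_rng(0)
img = rng.random((8, 8, 1))
feat = squeeze(img, 2)                       # (4, 4, 4)
sites = feat.reshape(-1, feat.shape[-1])     # flatten to 16 sites, d = 4
cores = rng.normal(scale=0.5, size=(len(sites), 3, 4, 3))
out_core = rng.normal(size=(3, 10, 3))
logits = mps_forward(sites, cores, out_core)  # shape (10,)
```

In this sketch the whole squeezed image is passed through one MPS, in contrast to a patch-based layer that would run a separate MPS on each patch and reassemble the outputs.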