We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at only a fraction of their entries. Our method combines a neural network encoder with a tensor train decomposition to learn a low-rank latent encoding, coupled with cross-approximation (CA) to learn the representation through a subset of the original samples. CA is an adaptive sampling algorithm that is native to tensor decompositions and avoids working with the full high-resolution data explicitly. Instead, it actively selects local representative samples that we fetch out of core and on demand. The required number of samples grows only logarithmically with the size of the input. Our implicit representation of the tensor in the network enables processing large grids that would otherwise be intractable in their uncompressed form. The proposed approach is particularly useful for large-scale multidimensional grid data (e.g., 3D tomography), and for tasks that require context over a large receptive field (e.g., predicting the medical condition of entire organs). The code is available at https://github.com/aelphy/c-pic.
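To make the tensor-train side of the pipeline concrete, the sketch below shows how a single entry of a TT-compressed tensor can be evaluated directly from its small cores, without ever materializing the full grid; this is the property that lets the framework represent the tensor implicitly. It is a minimal illustration only, not the released implementation: the function name `tt_entry` and the toy core shapes are assumptions made for the example.

```python
import numpy as np

def tt_entry(cores, index):
    """Evaluate one entry of a tensor stored in tensor-train (TT) format.

    cores: list of 3-D arrays G_k with shapes (r_{k-1}, n_k, r_k), r_0 = r_d = 1.
    index: tuple (i_1, ..., i_d) of the entry to reconstruct.

    The entry is the product of the selected lateral slices:
        T[i_1, ..., i_d] = G_1[:, i_1, :] @ G_2[:, i_2, :] @ ... @ G_d[:, i_d, :]
    Only the cores are touched, so the cost is O(d * r^2) per entry.
    """
    result = np.ones((1, 1))
    for core, i in zip(cores, index):
        result = result @ core[:, i, :]
    return result.item()

# Toy example: a 4 x 5 x 6 tensor with TT-ranks (1, 2, 3, 1).
rng = np.random.default_rng(0)
core_shapes = [(1, 4, 2), (2, 5, 3), (3, 6, 1)]
cores = [rng.standard_normal(s) for s in core_shapes]

print(tt_entry(cores, (1, 2, 3)))  # single entry, no full-tensor reconstruction
```

Cross-approximation builds on the same idea in reverse: rather than reading entries from given cores, it adaptively chooses which fibers of the original data to sample in order to fit the cores, which is why the number of fetched samples can stay small relative to the full grid.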