Data tensors of order 3 and greater are now routinely generated, and these collections are huge and growing. They are either tensor fields (e.g., images, videos, geographic data), in which each location of the data carries important information, or permutation-invariant general tensors (e.g., unsupervised latent-space learning, graph network analysis, recommendation systems). Directly accessing such large tensor collections for information has become prohibitively expensive. We learn approximate full-rank, compact tensor sketches with decompositive representations that provide compact space, time, and spectral embeddings of both tensor fields (P-SCT) and general tensors (P-SCT-Permute). All subsequent information querying is performed, with high accuracy, on the generative sketches. We produce optimal rank-r Tucker decompositions of arbitrary-order data tensors by building tensor sketches from a sample-efficient sub-sampling of tensor slices. The sample-efficient policy is learned via adaptable stochastic Thompson sampling using Dirichlet distributions with conjugate priors.
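The two ingredients named above can be sketched in a few lines. The following is a minimal, illustrative stand-in (not the paper's implementation): Thompson sampling with a Dirichlet conjugate prior picks informative mode-0 slices, using slice energy as a stand-in reward, and a truncated HOSVD then yields a rank-r Tucker sketch of the sub-sampled tensor. All function names and the reward choice are assumptions for illustration.

```python
import numpy as np

def thompson_slice_sampler(T, n_draws=20, seed=0):
    """Select informative mode-0 slices of T via Thompson sampling.

    A Dirichlet distribution over slice indices serves as the conjugate
    prior; slices whose draws yield higher energy (Frobenius norm) get
    larger concentration updates. Illustrative reward, not the paper's.
    """
    rng = np.random.default_rng(seed)
    n = T.shape[0]
    alpha = np.ones(n)                     # symmetric Dirichlet prior
    energies = np.array([np.linalg.norm(T[i]) for i in range(n)])
    for _ in range(n_draws):
        p = rng.dirichlet(alpha)           # sample from the posterior
        i = int(np.argmax(p))              # act greedily on the draw
        alpha[i] += energies[i] / energies.max()  # conjugate update
    # keep the slices with the largest concentrations
    keep = np.sort(np.argsort(alpha)[::-1][: max(2, n // 2)])
    return keep

def hosvd_tucker(T, ranks):
    """Rank-(r1, r2, r3) Tucker decomposition via truncated HOSVD."""
    factors = []
    for mode, r in enumerate(ranks):
        unfold = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfold, full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):     # core = T x_1 U1' x_2 U2' ...
        core = np.moveaxis(np.tensordot(core, U, axes=(mode, 0)), -1, mode)
    return core, factors

rng = np.random.default_rng(1)
T = rng.standard_normal((8, 6, 5))         # toy order-3 data tensor
keep = thompson_slice_sampler(T)
sketch = T[keep]                           # sub-sampled tensor sketch
core, factors = hosvd_tucker(sketch, (2, 2, 2))
print(core.shape)                          # (2, 2, 2)
```

Queries would then run against `core` and `factors` rather than the full tensor; the Dirichlet update rule here is deliberately simple, whereas the paper's policy adapts the sampling distribution during sketch construction.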