The tensor-train (TT) format is a data-sparse tensor representation commonly used to approximate high-dimensional functions arising in computational and data sciences. Various sequential and parallel TT decomposition algorithms have been proposed for different tensor inputs and assumptions. In this paper, we propose subtensor parallel adaptive TT cross, which partitions a tensor onto distributed-memory machines with a multidimensional process grid and iteratively constructs a TT approximation from tensor elements. We derive two iterative formulations for pivot selection and TT core construction in the distributed-memory setting, conduct communication and scaling analyses of the algorithm, and illustrate its performance with several test problems. These include Hilbert tensors of up to six dimensions and tensors constructed from Maxwellian distribution functions that arise in kinetic theory. Our results demonstrate that the TT cross approximation attains high accuracy with greatly reduced storage requirements. Furthermore, the proposed parallel algorithm achieves good to optimal strong and weak scaling performance.
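For concreteness, the element function queried by a cross algorithm for the Hilbert tensors mentioned above is, under the standard convention (the paper's exact indexing may differ),
\[
\mathcal{H}(i_1, i_2, \ldots, i_d) \;=\; \frac{1}{i_1 + i_2 + \cdots + i_d - d + 1}, \qquad i_k = 1, \ldots, n_k,
\]
which reduces to the classical Hilbert matrix $1/(i+j-1)$ when $d = 2$. Similarly, a standard Maxwellian in kinetic theory takes the form $f(x,v) = \frac{\rho(x)}{(2\pi T(x))^{d/2}} \exp\!\left(-\tfrac{|v-u(x)|^2}{2T(x)}\right)$, though the specific parameterization used in the experiments may vary.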