Learning with tree tensor networks: complexity estimates and model selection

Tree tensor networks, or tree-based tensor formats, are prominent model classes for the approximation of high-dimensional functions in computational and data science. They correspond to sum-product neural networks with a sparse connectivity associated with a dimension tree and widths given by a tuple of tensor ranks. The approximation power of these models has been proved to be (near-)optimal for classical smoothness classes. However, in an empirical risk minimization framework with a limited number of observations, the dimension tree and ranks should be selected carefully to balance estimation and approximation errors. We propose a complexity-based model selection method for tree tensor networks in an empirical risk minimization framework and analyze its performance over a wide range of smoothness classes. Given a family of model classes associated with different trees, ranks, tensor product feature spaces and sparsity patterns for sparse tensor networks, a model is selected (à la Barron, Birgé, Massart) by minimizing a penalized empirical risk, with a penalty depending on the complexity of the model class and derived from estimates of the metric entropy of tree tensor networks. This choice of penalty yields a risk bound for the selected predictor. In a least-squares setting, after deriving fast rates of convergence of the risk, we show that our strategy is (near-)minimax adaptive to a wide range of smoothness classes, including Sobolev or Besov spaces (with isotropic, anisotropic or mixed dominating smoothness) and analytic functions. We discuss the role of sparsity of the tensor network for obtaining optimal performance in several regimes. In practice, the amplitude of the penalty is calibrated with a slope heuristics method. Numerical experiments in a least-squares regression setting illustrate the performance of the strategy.
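The selection rule described above (pick the model class that minimizes empirical risk plus a complexity-dependent penalty) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the `Candidate` container, the penalty shape `lam * complexity / n_samples`, the parameter-count-style complexity values, and the risk numbers are all hypothetical. In particular, the penalty amplitude `lam` is set by hand, whereas the abstract states that it is calibrated with a slope heuristics method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one fitted model class (e.g. one tree/rank choice)."""
    name: str              # illustrative label only
    empirical_risk: float  # empirical risk of the fitted model on the sample
    complexity: int        # assumed complexity measure, e.g. a parameter count


def select_model(candidates, n_samples, lam):
    """Return the candidate minimizing empirical risk + penalty.

    The penalty lam * complexity / n_samples is an assumed form chosen
    for illustration; the abstract only says the penalty depends on the
    complexity of the model class.
    """
    def criterion(c):
        return c.empirical_risk + lam * c.complexity / n_samples

    return min(candidates, key=criterion)


# Made-up numbers: richer classes fit better but pay a larger penalty.
candidates = [
    Candidate("ranks=1", 0.50, 10),
    Candidate("ranks=2", 0.20, 40),
    Candidate("ranks=4", 0.15, 160),
]

best = select_model(candidates, n_samples=100, lam=0.1)
print(best.name)
```

With `lam=0.0` the rule degenerates to plain empirical risk minimization and always favors the richest class, which is the overfitting behavior the penalty is meant to counter.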
