In the training of over-parameterized model functions via gradient descent, sometimes the parameters do not change significantly and remain close to their initial values. This phenomenon is called lazy training, and it motivates consideration of the linear approximation of the model function around the initial parameters. In the lazy regime, this linear approximation imitates the behavior of the parameterized model function, whose associated kernel, called the tangent kernel, specifies the training performance of the model. Lazy training is known to occur for (classical) neural networks of large width. In this paper, we show that the training of geometrically local parameterized quantum circuits enters the lazy regime for large numbers of qubits. More precisely, we prove bounds on the rate of change of the parameters of such a geometrically local parameterized quantum circuit in the training process, and on the precision of the linear approximation of the associated quantum model function; both of these bounds tend to zero as the number of qubits grows. We support our analytic results with numerical simulations.
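As a point of reference, the linearization and tangent kernel mentioned above take the following standard form in the lazy-training literature (generic notation; not necessarily the conventions used in the body of the paper). For a model function $f(x;\theta)$ with initial parameters $\theta_0$,
\[
f_{\mathrm{lin}}(x;\theta) = f(x;\theta_0) + \nabla_\theta f(x;\theta_0)^{\top}(\theta - \theta_0),
\qquad
K(x,x') = \nabla_\theta f(x;\theta_0)^{\top}\, \nabla_\theta f(x';\theta_0),
\]
so that in the lazy regime the trained model is well approximated by $f_{\mathrm{lin}}$, and training dynamics are governed by the (tangent) kernel $K$ evaluated at the initial parameters.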