We study the problem of preconditioning in sequential prediction. Through the theoretical lens of linear dynamical systems, we show that convolving the input sequence corresponds to applying a polynomial to the hidden transition matrix. Building on this insight, we propose a universal preconditioning method that convolves the input with the coefficients of orthogonal polynomials such as Chebyshev or Legendre. We prove that this approach reduces regret for two distinct prediction algorithms and yields the first sublinear, hidden-dimension-independent regret bounds (up to logarithmic factors) for systems with marginally stable and asymmetric transition matrices. Finally, extensive synthetic and real-world experiments show that this simple preconditioning strategy improves the performance of a diverse range of algorithms, including recurrent neural networks, and generalizes to signals beyond linear dynamical systems.
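To make the central idea concrete, the sketch below illustrates one plausible instantiation of the preconditioning step: convolving an input sequence with the power-basis coefficients of a Chebyshev polynomial $T_n$, so that (by the abstract's observation) the effective system applies $T_n$ to the hidden transition matrix. This is an illustrative assumption on our part, not the authors' released code; the polynomial degree, the causal zero-padded convolution, and the convention that coefficient $c_k$ multiplies $u_{t-k}$ are all hypothetical choices.

```python
import numpy as np

def chebyshev_precondition(u, degree=8):
    """Convolve an input sequence with the power-basis coefficients of
    the degree-n Chebyshev polynomial T_n.

    u: array of shape (T, d), the d-dimensional input sequence.
    Returns an array of the same shape. The convolution is causal;
    the first `degree` outputs use zero-padded history.
    (Illustrative sketch: degree, ordering, and padding are assumptions.)
    """
    # Coefficients of T_n in the monomial basis, ordered c_0, ..., c_n.
    c = np.polynomial.chebyshev.cheb2poly([0.0] * degree + [1.0])
    T, d = u.shape
    out = np.zeros_like(u, dtype=float)
    for t in range(T):
        for k, ck in enumerate(c):
            if t - k >= 0:
                out[t] += ck * u[t - k]  # \tilde{u}_t = sum_k c_k u_{t-k}
    return out

# Usage: precondition a random scalar input sequence before feeding it
# to any downstream predictor (e.g., an RNN or online linear filter).
u = np.random.randn(200, 1)
u_tilde = chebyshev_precondition(u, degree=8)
```

A Legendre variant would only swap `cheb2poly` for `np.polynomial.legendre.leg2poly`; the downstream predictor is unchanged, which is what makes the method universal.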