Scalable inference in functional linear regression with streaming data

Traditional static functional data analysis faces new challenges from streaming data, where observations arrive continuously. A major challenge is that storing such an ever-increasing amount of data in memory is nearly impossible. In addition, existing inferential tools in online learning are mainly developed for finite-dimensional problems, while inference methods for functional data focus on the batch learning setting. In this paper, we tackle these issues by developing functional stochastic gradient descent algorithms and proposing an online bootstrap resampling procedure to systematically study the inference problem for functional linear regression. In particular, the proposed estimation and inference procedures use only one pass over the data; they are therefore easy to implement and well suited to settings where data arrive in a streaming manner. Furthermore, we establish the convergence rate as well as the asymptotic distribution of the proposed estimator. The perturbed estimator from the bootstrap procedure is shown to enjoy the same theoretical properties, which provides the theoretical justification for our online inference tool. To the best of our knowledge, this is the first inference result for the functional linear regression model with streaming data. Simulation studies are conducted to investigate the finite-sample performance of the proposed procedure. An application is illustrated with the Beijing multi-site air-quality data.
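To make the one-pass idea concrete, the following is a minimal sketch of what a functional stochastic gradient descent update with an online multiplier bootstrap could look like. It is not the paper's algorithm: the abstract does not give the update rule, so the sketch assumes curves observed on a common grid, a plain L2 gradient step, a polynomially decaying step size, averaged iterates, and independent Exponential(1) perturbation weights (mean 1, variance 1). The names `StreamingFLR`, `trapezoid_weights`, `simulate_stream`, and all parameters are hypothetical.

```python
import numpy as np


def trapezoid_weights(grid):
    """Quadrature weights w such that sum(w * f(grid)) approximates the integral of f."""
    w = np.zeros_like(grid, dtype=float)
    d = np.diff(grid)
    w[:-1] += d / 2
    w[1:] += d / 2
    return w


class StreamingFLR:
    """One-pass SGD for y = integral of x(t) * beta(t) dt + noise (illustrative sketch).

    Row 0 of the state holds the unperturbed iterate; rows 1..n_boot hold
    bootstrap copies whose gradients are scaled by random weights.
    All modelling choices here are assumptions, not taken from the paper.
    """

    def __init__(self, grid, n_boot=200, step0=0.5, decay=0.5, seed=0):
        self.w = trapezoid_weights(grid)
        self.step0, self.decay = step0, decay
        self.rng = np.random.default_rng(seed)
        self.n = 0
        self.beta = np.zeros((n_boot + 1, grid.size))      # current iterates
        self.beta_bar = np.zeros((n_boot + 1, grid.size))  # running averages

    def update(self, x, y):
        """Consume one observation (curve x on the grid, scalar y), then discard it."""
        self.n += 1
        gamma = self.step0 * self.n ** (-self.decay)
        # Assumed perturbation weights: Exponential(1); row 0 stays unperturbed.
        mult = self.rng.exponential(1.0, size=self.beta.shape[0])
        mult[0] = 1.0
        resid = y - self.beta @ (self.w * x)               # one residual per row
        self.beta += gamma * (mult * resid)[:, None] * x[None, :]
        self.beta_bar += (self.beta - self.beta_bar) / self.n

    def estimate(self):
        return self.beta_bar[0]

    def pointwise_band(self, level=0.95):
        """Percentile band computed from the perturbed averaged iterates."""
        q = [(1 - level) / 2, (1 + level) / 2]
        lower, upper = np.quantile(self.beta_bar[1:], q, axis=0)
        return lower, upper


def simulate_stream(grid, n, seed=1, noise_sd=0.5):
    """Hypothetical data generator: cosine-basis curves and a smooth slope function."""
    rng = np.random.default_rng(seed)
    w = trapezoid_weights(grid)
    basis = np.sqrt(2) * np.cos(np.outer(np.arange(1, 4), np.pi * grid))
    beta_true = 2.0 * basis[0] + 1.0 * basis[1]
    for _ in range(n):
        scores = rng.normal(size=3) / np.arange(1, 4)      # decaying score variances
        x = scores @ basis
        y = np.sum(w * x * beta_true) + noise_sd * rng.normal()
        yield x, y, beta_true
```

Each call to `update` touches only the current observation and a fixed-size state, so memory use does not grow with the length of the stream; a caller would loop over `simulate_stream(grid, n)`, pass each `(x, y)` to `update`, and read off `estimate()` and `pointwise_band()` at any time. Whether a percentile band of this kind has the coverage properties established in the paper depends on the paper's actual construction, which this sketch does not reproduce.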
