Functional Principal Component Analysis is a reference method for dimension reduction of curve data. Its theoretical properties are now well understood in the simplified case where the sample curves are fully observed without noise. However, functional data are noisy and necessarily observed on a finite discretization grid. Common practice consists of smoothing the data and then computing the functional estimates, but the impact of this denoising step on the procedure's statistical performance is rarely considered. Here we prove new convergence rates for functional principal component estimators. We introduce a double asymptotic framework, with one asymptotic regime for the sample size and a second for the size of the grid. We prove that estimates based on projection onto histograms achieve optimal rates in a minimax sense. Theoretical results are illustrated on simulated data, and the method is applied to the visualization of genomic data.
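As an illustration of the kind of estimator described above, the following is a minimal sketch of principal component estimation after projection onto a histogram basis. It is not the authors' implementation: the function name `histogram_fpca`, the regular grid on [0, 1], and the requirement that the number of bins divides the grid size are all assumptions made for this example.

```python
import numpy as np


def histogram_fpca(Y, n_bins, n_components):
    """Sketch of FPCA after projection onto histograms (hypothetical helper).

    Y is an (n, p) array: n noisy curves observed on a regular grid of
    p points in [0, 1]. Assumes n_bins divides p.
    """
    n, p = Y.shape
    if p % n_bins != 0:
        raise ValueError("this sketch assumes n_bins divides the grid size p")
    width = p // n_bins

    # Averaging within each bin projects every curve onto piecewise-constant
    # functions; this averaging is also the denoising step.
    bin_means = Y.reshape(n, n_bins, width).mean(axis=2)

    # Coefficients in the orthonormal histogram basis sqrt(n_bins) * 1_bin.
    coefs = bin_means / np.sqrt(n_bins)
    coefs = coefs - coefs.mean(axis=0)

    # Empirical covariance of the coefficients and its eigendecomposition.
    cov = coefs.T @ coefs / n
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]

    # Piecewise-constant eigenfunctions evaluated on the original grid.
    eigfuns = np.repeat(eigvecs[:, order].T * np.sqrt(n_bins), width, axis=1)
    return eigvals[order], eigfuns
```

In this sketch the number of bins plays the role of the smoothing parameter: fewer bins average out more noise but approximate the curves more coarsely, which is the trade-off that a joint analysis in the sample size and the grid size has to balance.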