In recent years, different types of distributed and parallel learning schemes have received increasing attention for their strong advantages in handling large-scale data information. In the information era, to face the big data challenges {that} stem from functional data analysis very recently, we propose a novel distributed gradient descent functional learning (DGDFL) algorithm to tackle functional data across numerous local machines (processors) in the framework of reproducing kernel Hilbert space. Based on integral operator approaches, we provide the first theoretical understanding of the DGDFL algorithm in many different aspects of the literature. On the way of understanding DGDFL, firstly, a data-based gradient descent functional learning (GDFL) algorithm associated with a single-machine model is proposed and comprehensively studied. Under mild conditions, confidence-based optimal learning rates of DGDFL are obtained without the saturation boundary on the regularity index suffered in previous works in functional regression. We further provide a semi-supervised DGDFL approach to weaken the restriction on the maximal number of local machines to ensure optimal rates. To our best knowledge, the DGDFL provides the first divide-and-conquer iterative training approach to functional learning based on data samples of intrinsically infinite-dimensional random functions (functional covariates) and enriches the methodologies for functional data analysis.
翻译:暂无翻译