Recent advances in foundation models have shown great promise in domains such as natural language processing and computer vision, and similar efforts are now emerging in the Earth Observation community. These models aim to generalize across tasks with limited supervision, reducing the need to train a separate model for each task. However, current strategies, which largely focus on scaling model size and dataset volume, require prohibitive computational and data resources, limiting accessibility to only a few large institutions. Moreover, this paradigm of ever-larger models stands in stark contrast with the principles of sustainable and environmentally responsible AI, as it leads to immense carbon footprints and resource inefficiency. In this work, we present a novel and efficient alternative: an Ensemble-of-Specialists framework for building Remote Sensing Foundation Models (RSFMs). Our method decomposes the training process into lightweight, task-specific ConvNeXtV2 specialists that can be frozen and reused. This modular approach offers strong advantages in efficiency, interpretability, and extensibility. In addition, it naturally supports federated training, pruning, and continuous specialist integration, making it particularly well-suited for collaborative and resource-constrained settings. Our framework sets a new direction for building scalable and efficient RSFMs. All code and pretrained models are available at https://github.com/pierreadorni/EoS-FM.
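To make the ensemble-of-specialists idea concrete, the following is a minimal sketch (not the authors' exact architecture): several pretrained ConvNeXtV2 backbones act as frozen specialists, and only a lightweight head on top of their concatenated features is trained for a new downstream task. The timm model names, the concatenation scheme, and the linear head are illustrative assumptions.

```python
# Hedged sketch of an ensemble of frozen ConvNeXtV2 specialists.
# Assumes the timm library; model names like "convnextv2_atto" are from its model zoo.
import torch
import torch.nn as nn
import timm


class EnsembleOfSpecialists(nn.Module):
    def __init__(self, specialist_names, num_classes):
        super().__init__()
        # Each specialist is a ConvNeXtV2 encoder returning pooled features
        # (num_classes=0 removes the classification head). Set pretrained=True
        # to load task-specific weights in a real setting.
        self.specialists = nn.ModuleList(
            [timm.create_model(name, pretrained=False, num_classes=0)
             for name in specialist_names]
        )
        for enc in self.specialists:
            enc.requires_grad_(False)  # specialists stay frozen and reusable
        feat_dim = sum(enc.num_features for enc in self.specialists)
        # Only this small task head is trained per downstream task.
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        with torch.no_grad():  # no gradients flow into the frozen specialists
            feats = [enc(x) for enc in self.specialists]
        return self.head(torch.cat(feats, dim=1))


# Usage: two small specialists combined for a hypothetical 10-class scene task.
model = EnsembleOfSpecialists(["convnextv2_atto", "convnextv2_nano"], num_classes=10)
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```

Because the specialists are frozen, new ones can be appended (or pruned) without retraining the others, which is what makes the approach amenable to federated and incremental settings.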