Recent advances in foundation models have shown great promise in domains such as natural language processing and computer vision, and similar efforts are now emerging in the Earth Observation community. These models aim to generalize across tasks with limited supervision, reducing the need to train a separate model for each task. However, current strategies, which largely focus on scaling model size and dataset volume, require prohibitive computational and data resources, limiting accessibility to a few large institutions. Moreover, this paradigm of ever-larger models conflicts with the principles of sustainable and environmentally responsible AI, as it leads to large carbon footprints and inefficient use of resources. In this work, we present an efficient alternative: an Ensemble-of-Specialists framework for building Remote Sensing Foundation Models (RSFMs). Our method decomposes the training process into lightweight, task-specific ConvNeXtV2 specialists that can be frozen and reused. This modular approach offers advantages in efficiency, interpretability, and extensibility. It also naturally supports federated training, pruning, and continuous specialist integration, making it well suited to collaborative and resource-constrained settings. Our framework points to a new direction for building scalable and efficient RSFMs.