The generalization of 3D deep learning across multiple domains remains constrained by the limited scale of existing datasets and the high heterogeneity of multi-source point clouds. Point clouds collected from different sensors (e.g., LiDAR scans and mesh-derived point clouds) exhibit substantial discrepancies in density and noise distribution, resulting in negative transfer during multi-domain fusion. Most existing approaches focus exclusively on either domain-aware or domain-general features, overlooking the potential synergy between them. To address this, we propose DoReMi (Domain-Representation Mixture), a Mixture-of-Experts (MoE) framework that jointly models a domain-aware expert branch and a unified representation branch to enable cooperative learning between specialized and generalizable knowledge. DoReMi dynamically activates the domain-aware expert branch via Domain-Guided Spatial Routing (DSR) for context-aware expert selection and employs Entropy-Controlled Dynamic Allocation (EDA) for stable and efficient expert utilization, thereby adaptively modeling diverse domain distributions. Complemented by a frozen unified representation branch pretrained through robust multi-attribute self-supervised learning, DoReMi preserves cross-domain geometric and structural priors while maintaining global consistency. We evaluate DoReMi across multiple 3D understanding benchmarks. Notably, DoReMi achieves 80.1% mIoU on ScanNet Val and 77.2% mIoU on S3DIS, demonstrating competitive or superior performance compared to existing approaches and showing strong potential as a foundation framework for future 3D understanding research. The code will be released soon.
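The abstract does not specify how DSR or EDA are computed, so the following is only a generic illustration of the two ideas it names: sparse expert gating combined with a shared feature branch, and an entropy measure over the gate distribution. It is a minimal sketch in plain Python, not the authors' method. All function names (`top_k_gate`, `mix`, `entropy`), the top-k rule, and the additive fusion are assumptions introduced here for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; terms with p == 0 contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def top_k_gate(logits, k):
    """Keep the k largest gate probabilities and renormalize them to sum to 1.

    Assumption: a standard top-k MoE gate, used here as a stand-in for
    whatever routing rule the paper actually defines.
    """
    probs = softmax(logits)
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in keep)
    return {i: probs[i] / mass for i in keep}


def mix(shared_feat, expert_feats, gate):
    """Add gate-weighted expert features to the shared-branch feature.

    Assumption: additive fusion; the abstract does not say how the
    two branches are combined.
    """
    out = list(shared_feat)
    for i, w in gate.items():
        for d, v in enumerate(expert_feats[i]):
            out[d] += w * v
    return out


# Hypothetical router scores for three experts at one point.
logits = [2.0, 0.5, -1.0]
gate = top_k_gate(logits, k=2)

# Shared (frozen-branch) feature plus three toy expert outputs.
fused = mix([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]], gate)

# Entropy of the full gate distribution: low values mean a few experts
# dominate, high values mean load is spread evenly.
gate_entropy = entropy(softmax(logits))
```

In a sketch like this, the gate entropy could serve as a signal for adjusting how many experts are activated, though whether EDA works that way is not stated in the abstract.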