Mixtures of experts (MoE) are a popular class of statistical and machine learning models that have gained attention over the years for their flexibility and efficiency. In this work, we consider Gaussian-gated localized MoE (GLoME) and block-diagonal covariance localized MoE (BLoME) regression models to capture nonlinear relationships in heterogeneous data with potential hidden graph-structured interactions between high-dimensional predictors. These models raise difficult statistical estimation and model selection questions, from both computational and theoretical perspectives. This paper is devoted to the problem of model selection among a collection of GLoME or BLoME models characterized by the number of mixture components, the complexity of the Gaussian mean experts, and the hidden block-diagonal structures of the covariance matrices, in a penalized maximum likelihood estimation framework. In particular, we establish non-asymptotic risk bounds that take the form of weak oracle inequalities, provided that lower bounds on the penalties hold. The good empirical behavior of our models is then demonstrated on synthetic and real datasets.
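To fix ideas, the penalized model selection step and the resulting oracle-type guarantee can be written schematically as follows; the notation here is illustrative rather than taken from the paper ($s_0$ denotes the true conditional density, $S_m$ the model indexed by $m$ in the collection $\mathcal{M}$, $\widehat{s}_m$ its penalized maximum likelihood estimator, and $d$ a suitable divergence, e.g., of Jensen--Kullback--Leibler type). The selected model minimizes a penalized negative log-likelihood,
\[
\widehat{m} \in \operatorname*{arg\,min}_{m \in \mathcal{M}} \left\{ -\frac{1}{n} \sum_{i=1}^{n} \log \widehat{s}_m(Y_i \mid X_i) + \operatorname{pen}(m) \right\},
\]
and, provided $\operatorname{pen}(m)$ is bounded below by a quantity proportional to the complexity of $S_m$ divided by $n$, the risk of the selected estimator is controlled by the best penalized trade-off over the collection,
\[
\mathbb{E}\!\left[ d\!\left(s_0, \widehat{s}_{\widehat{m}}\right) \right] \le C \inf_{m \in \mathcal{M}} \left\{ \inf_{s_m \in S_m} d(s_0, s_m) + \operatorname{pen}(m) \right\} + \frac{C'}{n}.
\]
The inequality is called weak because the divergence on the left-hand side may differ from the one measuring the approximation bias on the right-hand side.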