Do data groups that are pre-defined by expert opinion or medical diagnosis correspond to groups derived from statistical modeling? Why might observations be inconsistent with their groups? This contribution addresses both questions by proposing a novel multi-group Gaussian mixture model that accounts for the given group context while remaining highly flexible. This is achieved by assuming that the observations of a particular group originate not from a single distribution but from a Gaussian mixture of all group distributions. Moreover, the model is robust against cellwise outliers, that is, against atypical individual cells of the observations. The objective function can be formulated as a likelihood problem and optimized efficiently. We also derive the theoretical breakdown point of the estimators, a new result in this context that quantifies the degree of robustness to cellwise outliers. Simulations demonstrate excellent performance and advantages over alternative models and estimators. Applications from different areas illustrate the strength of the method, particularly for investigating observations that lie in the overlap between groups.
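
The mixture assumption stated above can be sketched as a formula. The notation is assumed here for illustration and is not taken from the paper: N denotes the number of pre-defined groups, \pi_{g,k} hypothetical mixing weights, and \mu_k, \Sigma_k the mean and covariance of the k-th group distribution. The cellwise-robust part of the model is not shown.

```latex
% Density of an observation x assigned to group g (illustrative notation):
% a Gaussian mixture over all N group distributions.
\[
  f_g(x) \;=\; \sum_{k=1}^{N} \pi_{g,k}\, \varphi(x;\, \mu_k, \Sigma_k),
  \qquad \pi_{g,k} \ge 0, \quad \sum_{k=1}^{N} \pi_{g,k} = 1,
\]
% where \varphi(\,\cdot\,;\, \mu, \Sigma) is the multivariate normal density.
```

Under this sketch, setting \pi_{g,g} = 1 recovers the case in which each group follows a single Gaussian distribution, while weights \pi_{g,k} > 0 for k \neq g would allow observations labeled g to behave like members of another group.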