Modeling high-dimensional data is important for distinguishing between classes. We develop a new mixture model, the Multinomial Cluster-Weighted Model (MCWM), and establish identifiability for a general class of MCWMs. We estimate the proposed model with the Expectation-Maximization (EM) algorithm, using iteratively reweighted least squares (EM-IRLS) and stochastic gradient descent (EM-SGD) variants. Model selection is carried out using several information criteria. Several variants of the Adjusted Rand Index are also considered as alternative measures of clustering accuracy. The clustering performance of the proposed model is investigated on simulated and real datasets, where MCWM shows excellent clustering results as measured by accuracy and area under the ROC curve.
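For readers unfamiliar with the Adjusted Rand Index, the following is a minimal Python sketch of the standard pair-counting form (Hubert and Arabie). It is an illustration only: the abstract does not say which ARI variants are used, and the function name `adjusted_rand_index` and the handling of degenerate partitions are assumptions made here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard pair-counting ARI between two labelings of the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two items counts as perfect agreement

    # Contingency counts n_ij and their row/column marginals.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of item pairs placed together under both labelings.
    index = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    # Expected value of the index under random labelings with fixed marginals.
    expected = sum_rows * sum_cols / total_pairs
    max_index = 0.5 * (sum_rows + sum_cols)

    if max_index == expected:
        return 1.0  # assumption: identical trivial partitions count as perfect agreement
    return (index - expected) / (max_index - expected)
```

The index equals 1 when two partitions agree up to a relabeling of the clusters, and it is close to 0 for partitions that agree no more than chance would predict, which is why it suits clustering output whose labels are arbitrary.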