The self-organizing map is an unsupervised neural network that is widely used for data visualisation and clustering in chemometrics. The classical Kohonen algorithm for computing self-organizing maps applies only to complete data. In many applications, however, partially observed data are the norm. In this paper, we propose an extension of self-organizing maps to incomplete data via a new criterion that also defines estimators of the missing values. We also provide an adaptation of the Kohonen algorithm, named missSOM, that computes these self-organizing maps and imputes the missing values, together with an efficient implementation. Numerical experiments on simulated data and a chemical dataset illustrate the short computing time of missSOM and assess its performance against several criteria and in comparison with the state of the art.
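To make the idea of a self-organizing map on incomplete data concrete, the following is a minimal sketch and not the missSOM algorithm or its criterion, which the abstract does not spell out. It assumes a common heuristic: the best-matching unit is found using only the observed coordinates of each sample, prototypes are updated only on those coordinates, and missing entries are finally filled in from the prototype of the sample's best-matching unit. The function name `fit_som_with_missing`, its parameters, and the linear decay schedules are all illustrative assumptions.

```python
import numpy as np


def fit_som_with_missing(X, grid_shape=(3, 3), n_epochs=20,
                         lr0=0.5, sigma0=1.0, seed=0):
    """Illustrative online SOM for data with NaNs (not the missSOM method).

    Assumes every column of X has at least one observed value.
    Returns (prototypes, imputed copy of X, best-matching unit per row).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    observed = ~np.isnan(X)

    # Positions of the map units on a rectangular grid.
    rows, cols = grid_shape
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)],
                    dtype=float)
    n_units = rows * cols

    # Initialise prototypes near the column means of the observed entries.
    col_mean = np.nanmean(X, axis=0)
    col_std = np.nanstd(X, axis=0)
    W = col_mean + 0.1 * col_std * rng.standard_normal((n_units, p))

    n_steps = n_epochs * n
    step = 0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            m = observed[i]
            frac = step / n_steps
            step += 1
            if not m.any():
                continue  # a fully missing row carries no information
            lr = lr0 * (1.0 - frac)                   # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-2      # shrinking neighbourhood
            # Squared distance restricted to the observed coordinates.
            d = ((W[:, m] - X[i, m]) ** 2).sum(axis=1)
            bmu = d.argmin()
            # Gaussian neighbourhood on the grid around the winning unit.
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1)
                       / (2.0 * sigma ** 2))
            # Move prototypes towards the sample on observed coordinates only.
            W[:, m] += lr * h[:, None] * (X[i, m] - W[:, m])

    # Impute each missing entry with the matching coordinate of the BMU.
    X_imp = X.copy()
    bmus = np.zeros(n, dtype=int)
    for i in range(n):
        m = observed[i]
        d = ((W[:, m] - X[i, m]) ** 2).sum(axis=1)
        bmus[i] = d.argmin()  # defaults to unit 0 if the row is fully missing
        X_imp[i, ~m] = W[bmus[i], ~m]
    return W, X_imp, bmus
```

Calling `fit_som_with_missing(X, grid_shape=(2, 2))` on an array containing `np.nan` entries returns the prototype matrix, a copy of `X` with the missing entries filled in, and the winning unit for each row. Observed entries are left unchanged; only the missing ones are replaced.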