This paper addresses the problem of mapping high-dimensional data to a low-dimensional space in the presence of other known features. The problem is ubiquitous in science and engineering, because most applications involve features that can be controlled or measured. Furthermore, the features discovered in one analysis can serve as the known features in the next, and this can be repeated. To solve this problem, the paper proposes a broad class of methods referred to as conditional multidimensional scaling, and develops an algorithm for optimizing its objective function. The proposed framework is illustrated with kinship terms, facial expressions, and simulated car-brand perception examples. These examples demonstrate two benefits of the framework: it marginalizes out the known features to uncover unknown, unanticipated features in the reduced-dimension space, and it enables a repeated, more straightforward knowledge discovery process. Code for this work is available in the open-source cml R package.