These notes give an overview of some classical linear methods in multivariate data analysis. The field is a long-standing one, well established since the 1960s, and it has found renewed relevance as a key step in statistical learning. It can be presented either as part of statistical learning or as dimensionality reduction with a geometric flavor. The two views are tightly linked: it is easier to learn patterns from data in low-dimensional spaces than in high-dimensional ones. The notes show how a variety of methods and tools boil down to a single core method, PCA with SVD, so that efforts to optimize code for analyzing massive data sets can focus on this shared core and benefit all of the methods. An extension to the study of several arrays (Canonical Analysis) is also presented.
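As an illustration of the shared core mentioned above, the following is a minimal sketch of PCA computed through an SVD of the centered data matrix. It is not taken from the notes: the function name `pca_svd`, its parameters, the use of NumPy, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA of an (n_samples, n_features) array via SVD (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Center each column so the SVD describes variation around the mean.
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values in decreasing order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Rows of Vt are the principal axes in feature space.
    components = Vt[:n_components]
    # Coordinates of the samples on the principal axes (equal to Xc @ components.T).
    scores = U[:, :n_components] * s[:n_components]
    # Variance carried by each axis, using the unbiased n - 1 normalization.
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance

# Synthetic data, for illustration only: 200 samples with 5 correlated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
scores, components, var = pca_svd(X, n_components=2)
```

In this sketch, `scores` holds the low-dimensional coordinates of the samples, and `var` matches the leading eigenvalues of the sample covariance matrix, which is the usual link between the SVD and the eigendecomposition formulation of PCA.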