A guided tour helps to visualise high-dimensional data by showing low-dimensional projections along a projection pursuit optimisation path. Projection pursuit is a generalisation of principal component analysis, in the sense that different indexes are used to define the interestingness of the projected data. While much work has been done in the literature on developing new indexes, less has been done on understanding the optimisation. Index functions can be noisy, may have multiple local maxima as well as a global maximum, and must be optimised subject to the constraint that projection frames are orthonormal, all of which complicates the optimisation. In addition, projection pursuit is primarily used for exploratory data analysis, so finding local maxima is also useful. The guided tour is especially useful for exploration, because it uses geodesic interpolation to connect steps in the optimisation and shows how the projected data changes as a maximum is approached. This work provides new visual diagnostics for examining a choice of optimisation procedure, based on a new data object that collects information throughout the optimisation. These diagnostics have helped to diagnose and fix several problems with the projection pursuit guided tour. The work may be useful more broadly for diagnosing optimisers and comparing their performance. The diagnostics are implemented in the R package ferrn.
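To make the idea of a data object that collects information throughout the optimisation concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the ferrn or tourr implementation: the optimiser is a plain random search, the projection is one-dimensional (so the orthonormality constraint reduces to keeping a unit-length vector), the index is the variance of the projected data (the choice that corresponds to principal component analysis), and the record fields `iter`, `basis`, `index` and `accepted` are hypothetical names chosen for this example. Geodesic interpolation between accepted bases is not shown.

```python
import math
import random


def normalise(v):
    """Rescale a vector to unit length (the 1-D orthonormality constraint)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(data, basis):
    """Project each observation (row) onto the 1-D basis vector."""
    return [sum(x * b for x, b in zip(row, basis)) for row in data]


def variance_index(proj):
    """Illustrative index: variance of the projected data."""
    mean = sum(proj) / len(proj)
    return sum((y - mean) ** 2 for y in proj) / len(proj)


def random_search(data, index, n_iter=200, step=0.3, seed=0):
    """Random-search projection pursuit that records every trial.

    Returns a list of records, one per evaluated basis, including the
    rejected candidates, so the search can be inspected afterwards.
    """
    rng = random.Random(seed)
    p = len(data[0])
    current = normalise([rng.gauss(0, 1) for _ in range(p)])
    current_val = index(project(data, current))
    trace = [{"iter": 0, "basis": current,
              "index": current_val, "accepted": True}]
    for i in range(1, n_iter + 1):
        # Perturb the current basis, then renormalise to stay on the unit sphere.
        candidate = normalise([c + step * rng.gauss(0, 1) for c in current])
        val = index(project(data, candidate))
        accepted = val > current_val
        trace.append({"iter": i, "basis": candidate,
                      "index": val, "accepted": accepted})
        if accepted:
            current, current_val = candidate, val
    return trace


# Toy data (an assumption for this example): the first variable has the
# largest spread, so a variance index should favour that direction.
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 3), data_rng.gauss(0, 1), data_rng.gauss(0, 1)]
        for _ in range(200)]

trace = random_search(data, variance_index)
best = [r for r in trace if r["accepted"]][-1]
print(len(trace), best["iter"], round(best["index"], 3))
```

Because rejected candidates are kept alongside accepted ones, a trace like this could be plotted as index value against iteration to see where a search stalls, which is the kind of question the visual diagnostics are meant to answer.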