We propose a principal manifold framework for modeling high-dimensional data. The framework is based on Sobolev spaces and is designed to model data of any intrinsic dimension. It includes principal component analysis and the principal curve algorithm as special cases. We also propose a novel method for model complexity selection that avoids overfitting, eliminates the effects of outliers, and improves computation speed. Additionally, we propose a method for identifying the interiors of circle-like curves and cylinder/ball-like surfaces. The proposed approach is compared to existing methods in simulations and is applied to estimate tumor surfaces and interiors in a lung cancer study.