Analysis of high-dimensional data can offer a detailed description of a system, but it is often hampered by the curse of dimensionality. General dimensionality reduction techniques can alleviate this difficulty by extracting a few important features, but the extracted features lack interpretability and are hard to connect to the decisions associated with each physical variable. Variable selection techniques, as an alternative, preserve interpretability, but they often rely either on a greedy search, which is liable to miss important interactions, or on a metaheuristic search, which requires extensive computation. This research proposes a new method that generates subspaces (reduced-dimensional physical spaces) through a randomized search and leverages an ensemble of models built on the critical subspaces, thereby achieving both dimensionality reduction and variable selection. When applied to high-dimensional data from failure prediction of a composite/metal hybrid structure that exhibits complex progressive damage failure under loading, the proposed method outperforms existing and potential alternatives in both prediction and important-variable selection.
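As a rough illustration of the general idea (a randomized search over subspaces of the original variables, followed by an ensemble of the best-performing subspace models), the sketch below may help. It is not the method proposed in this work, whose details are not given here. The k-nearest-neighbour base model, the ranking of subspaces by validation accuracy, the majority vote, the frequency-based variable importance, the synthetic data, and all function names are assumptions made purely for illustration.

```python
import random
from collections import Counter


def make_data(n, n_vars, rng):
    """Synthetic data (assumption): the label depends only on the
    interaction between variables 0 and 1; the rest are noise."""
    X = [[rng.uniform(-1, 1) for _ in range(n_vars)] for _ in range(n)]
    y = [1 if row[0] * row[1] > 0 else 0 for row in X]
    return X, y


def knn_predict(X_train, y_train, x, subspace, k=5):
    """k-nearest-neighbour vote using only the variables in `subspace`."""
    dists = sorted(
        (sum((row[j] - x[j]) ** 2 for j in subspace), label)
        for row, label in zip(X_train, y_train)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0


def accuracy(X_train, y_train, X_eval, y_eval, subspace):
    """Fraction of evaluation points classified correctly in `subspace`."""
    hits = sum(
        knn_predict(X_train, y_train, x, subspace) == t
        for x, t in zip(X_eval, y_eval)
    )
    return hits / len(y_eval)


def random_subspace_search(X_train, y_train, X_val, y_val,
                           size, n_draws, n_keep, rng):
    """Draw random subspaces and keep the `n_keep` best on validation data."""
    n_vars = len(X_train[0])
    candidates = sorted(
        {tuple(sorted(rng.sample(range(n_vars), size))) for _ in range(n_draws)}
    )
    ranked = sorted(
        candidates,
        key=lambda s: accuracy(X_train, y_train, X_val, y_val, s),
        reverse=True,
    )
    return ranked[:n_keep]


def ensemble_predict(X_train, y_train, x, kept):
    """Majority vote over the models built on the kept subspaces."""
    votes = sum(knn_predict(X_train, y_train, x, s) for s in kept)
    return 1 if 2 * votes > len(kept) else 0


def variable_importance(kept):
    """Count how often each original variable appears in the kept subspaces."""
    return Counter(j for s in kept for j in s)


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(230, 8, rng)
    X_tr, y_tr = X[:150], y[:150]
    X_val, y_val = X[150:], y[150:]
    kept = random_subspace_search(X_tr, y_tr, X_val, y_val,
                                  size=2, n_draws=40, n_keep=5, rng=rng)
    print("kept subspaces:", kept)
    print("variable counts:", variable_importance(kept).most_common())
```

Because every kept subspace is a subset of the original physical variables, the variable counts remain directly interpretable, which is the property the abstract contrasts with feature-extraction approaches. In this toy version, an odd `n_keep` avoids tied votes.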