The stability of persistent homology has led to wide applications of the persistence diagram as a trusted topological descriptor in the presence of noise. However, with the increasing demand for high-dimension and low-sample-size data processing in modern science, it is questionable whether persistence diagrams retain their reliability in the presence of high-dimensional noise. This work aims to study the reliability of persistence diagrams in the high-dimension low-sample-size data setting. By analyzing the asymptotic behavior of persistence diagrams for high-dimensional random data, we show that persistence diagrams are no longer reliable descriptors of low-sample-size data under high-dimensional noise perturbations. We refer to this loss of reliability of persistence diagrams in such data settings as the curse of dimensionality on persistence diagrams. Next, we investigate the possibility of using normalized principal component analysis as a method for reducing the dimensionality of the high-dimensional observed data to resolve the curse of dimensionality. We show that this method can mitigate the curse of dimensionality on persistence diagrams. Our results shed some new light on the challenges of processing high-dimension low-sample-size data by persistence diagrams and provide a starting point for future research in this area.
翻译:暂无翻译