In the study of high-dimensional data, it is often assumed that the data set possesses an underlying lower-dimensional structure. A practical model for this structure is an embedded compact manifold with boundary. Since the underlying manifold structure is typically unknown, identifying boundary points from data distributed on the manifold is crucial for various applications. In this work, we propose a method for detecting boundary points inspired by the widely used locally linear embedding (LLE) algorithm. We implement this method using two nearest-neighbor search schemes: the $\epsilon$-radius ball scheme and the $K$-nearest neighbor scheme. The algorithm incorporates geometric information about the data structure, particularly through its close relation with the local covariance matrix. We discuss the selection of the key parameter and analyze the algorithm by studying the spectral properties of the local covariance matrix under both neighborhood search schemes. Furthermore, we demonstrate the algorithm's performance on simulated examples.
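
To make the two neighborhood search schemes concrete, the following is a minimal sketch of how a local covariance matrix could be assembled at a single data point under either scheme. It is not the boundary detection method itself. The function name `local_covariance`, its parameters, and the choice to center the differences at the query point rather than at the neighborhood mean are illustrative assumptions, not details taken from this work.

```python
import numpy as np


def local_covariance(points, index, eps=None, k=None):
    """Local covariance matrix at points[index] (illustrative sketch).

    Exactly one of `eps` (radius of the ball) or `k` (number of nearest
    neighbors) must be given. Differences are centered at the query
    point itself, which is an assumption made for this sketch.
    """
    if (eps is None) == (k is None):
        raise ValueError("specify exactly one of eps or k")

    center = points[index]
    dists = np.linalg.norm(points - center, axis=1)

    if eps is not None:
        # epsilon-radius ball scheme: all other points within distance eps
        mask = dists <= eps
        mask[index] = False
        neighbors = points[mask]
    else:
        # K-nearest neighbor scheme: the k closest other points
        order = np.argsort(dists)
        order = order[order != index][:k]
        neighbors = points[order]

    if len(neighbors) == 0:
        raise ValueError("no neighbors found")

    diffs = neighbors - center
    # Average of outer products (x_j - x)(x_j - x)^T over the neighborhood
    return diffs.T @ diffs / len(neighbors)
```

The eigenvalues of the returned matrix, for example via `np.linalg.eigvalsh`, are the kind of spectral quantity that the analysis refers to; how they are turned into a boundary indicator is left to the method described in the paper.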