Most existing methods for estimating the local intrinsic dimension of a data distribution do not scale well to high-dimensional data. Many of them rely on a non-parametric nearest-neighbor approach, which suffers from the curse of dimensionality. We attempt to address this challenge by proposing a novel approach to the problem: Local Intrinsic Dimension estimation using approximate Likelihood (LIDL). Our method uses an arbitrary density estimation method as a subroutine and hence tries to sidestep the dimensionality challenge by building on recent progress in parametric neural methods for likelihood estimation. We carefully investigate the empirical properties of the proposed method, compare them with our theoretical predictions, and show that LIDL yields competitive results on the standard benchmarks for this problem and scales to thousands of dimensions. Moreover, we expect this approach to improve further with continuing advances in the density estimation literature.
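The abstract does not spell out the estimator, so the following is only a minimal sketch of how a density model used as a subroutine could expose local dimension. It rests on an assumption not stated above: that the data are perturbed with Gaussian noise of scale δ, and that for small δ the perturbed log-density at a point x behaves like log ρ_δ(x) ≈ (d − D) log δ + const, where D is the ambient dimension and d the local intrinsic dimension. Under that assumption, d can be read off as D plus the slope of a linear fit of log ρ_δ(x) against log δ. The names `log_density_perturbed` and `estimate_lid` are hypothetical, and the closed-form Gaussian density stands in for a trained neural likelihood model.

```python
import numpy as np


def log_density_perturbed(x, delta, d):
    """Exact log-density of a toy distribution after adding N(0, delta^2 I) noise.

    Toy data (an assumption for illustration): a standard Gaussian on the
    first d coordinates of R^D and exactly zero on the remaining D - d.
    In practice this function would be replaced by a trained density estimator.
    """
    x = np.asarray(x, dtype=float)
    D = x.shape[0]
    # Per-coordinate variance after the perturbation.
    var = np.concatenate([np.full(d, 1.0 + delta**2), np.full(D - d, delta**2)])
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + x**2 / var))


def estimate_lid(x, deltas, log_density):
    """Fit log rho_delta(x) = slope * log(delta) + c and return D + slope."""
    x = np.asarray(x, dtype=float)
    log_deltas = np.log(deltas)
    log_rhos = np.array([log_density(x, delta) for delta in deltas])
    slope, _intercept = np.polyfit(log_deltas, log_rhos, 1)
    return x.shape[0] + slope


D, d = 1000, 10                      # ambient and true intrinsic dimension
x = np.zeros(D)                      # a point lying on the d-dimensional subspace
deltas = np.array([0.01, 0.02, 0.04, 0.08])
lid = estimate_lid(x, deltas, lambda p, s: log_density_perturbed(p, s, d))
print(f"estimated local intrinsic dimension: {lid:.2f}")
```

In this toy setting the estimate depends only on evaluating log-densities at a handful of noise scales, never on nearest-neighbor distances, which is the property the abstract credits for scaling to high dimensions. With a learned model, the quality of the estimate would depend on how accurately that model fits each perturbed distribution.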