Quantifying the difference between probability distributions is crucial in machine learning. However, estimating statistical divergences from empirical samples is challenging due to unknown underlying distributions. This work proposes the representation Jensen-Shannon divergence (RJSD), a novel measure inspired by the traditional Jensen-Shannon divergence. Our approach embeds data into a reproducing kernel Hilbert space (RKHS), representing distributions through uncentered covariance operators. We then compute the Jensen-Shannon divergence between these operators, thereby establishing a proper divergence measure between probability distributions in the input space. We provide estimators based on kernel matrices and empirical covariance matrices using Fourier features. Theoretical analysis reveals that RJSD is a lower bound on the Jensen-Shannon divergence, enabling variational estimation. Additionally, we show that RJSD is a higher-order extension of the maximum mean discrepancy (MMD), providing a more sensitive measure of distributional differences. Our experimental results demonstrate RJSD's superiority in two-sample testing, distribution shift detection, and unsupervised domain adaptation, outperforming state-of-the-art techniques. RJSD's versatility and effectiveness make it a promising tool for machine learning research and applications.
翻译:暂无翻译