The Cauchy-Schwarz (CS) divergence was developed by Pr\'{i}ncipe et al. in 2000. In this paper, we extend the classic CS divergence to quantify the closeness between two conditional distributions and show that the resulting conditional CS divergence can be estimated straightforwardly with a kernel density estimator from given samples. We illustrate the advantages of our conditional CS divergence (e.g., a rigorous faithfulness guarantee, lower computational complexity, higher statistical power, and greater flexibility in a wide range of applications) over previous proposals such as the conditional KL divergence and the conditional maximum mean discrepancy. We also demonstrate the compelling performance of the conditional CS divergence in two machine learning tasks related to time series data and sequential inference, namely time series clustering and uncertainty-guided exploration for sequential decision making.
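The classic CS divergence between densities $p$ and $q$ is $D_{CS}(p;q) = -\log\frac{\left(\int p(x)q(x)\,dx\right)^2}{\int p^2(x)\,dx\,\int q^2(x)\,dx}$, and each integral admits a closed-form Parzen-window (Gaussian kernel density) estimate from samples, which is what makes the sample-based estimation mentioned above simple. The snippet below is a minimal sketch of such an estimator for the unconditional case, assuming a fixed Gaussian bandwidth \texttt{sigma} and NumPy/SciPy; the function names and the bandwidth choice are illustrative, not the paper's implementation.

\begin{verbatim}
import numpy as np
from scipy.spatial.distance import cdist

def gaussian_gram(X, Y, sigma):
    """Pairwise Gaussian kernel values exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    d2 = cdist(X, Y, metric="sqeuclidean")
    return np.exp(-d2 / (2.0 * sigma ** 2))

def cs_divergence(X, Y, sigma=1.0):
    """Empirical CS divergence between samples X ~ p and Y ~ q.

    D_CS = -log( (int p q)^2 / (int p^2 * int q^2) ), with each integral
    replaced by its Parzen-window estimate; the kernel normalization
    constants cancel in the ratio, so plain Gram-matrix means suffice.
    """
    cross = gaussian_gram(X, Y, sigma).mean()   # ~ int p(x) q(x) dx
    self_p = gaussian_gram(X, X, sigma).mean()  # ~ int p(x)^2 dx
    self_q = gaussian_gram(Y, Y, sigma).mean()  # ~ int q(x)^2 dx
    return -np.log(cross ** 2 / (self_p * self_q))
\end{verbatim}

The estimate is nonnegative and equals zero only when the two kernel density estimates coincide; in practice the bandwidth would be tuned (e.g., by the median heuristic) rather than fixed at $1.0$.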