A fundamental task in AI is to assess (in)dependence between mixed-type variables (text, image, sound). We propose a Bayesian kernelised correlation test of (in)dependence using a Dirichlet process model. The new measure of (in)dependence allows us to answer some fundamental questions: Given the data, are (mixed-type) variables independent? How likely is dependence or independence to hold? How high is the probability that two mixed-type variables are more than just weakly dependent? We establish theoretical properties of the approach and give algorithms for computing it quickly. We empirically demonstrate the effectiveness of the proposed method by analysing its performance and by comparing it with frequentist and other Bayesian approaches on a range of datasets and tasks with mixed-type variables.
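To make the last of these questions concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the Dirichlet process posterior is approximated by its noninformative limit (the Bayesian bootstrap, i.e. flat Dirichlet weights over the observations) and that the kernelised correlation is a normalised, weighted HSIC-style statistic. All function names, the Gaussian and delta kernels, and the weak-dependence threshold of 0.1 are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(x, bandwidth=1.0):
    """Gaussian kernel Gram matrix for a numeric variable (assumed kernel choice)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def delta_gram(labels):
    """Delta kernel Gram matrix for a categorical variable (1 if labels match)."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def weighted_hsic(K, L, w):
    """HSIC V-statistic under the weighted empirical measure sum_i w_i * delta_i."""
    Kw, Lw = K @ w, L @ w
    return w @ (K * L) @ w - 2.0 * w @ (Kw * Lw) + (w @ Kw) * (w @ Lw)

def kernel_correlation(K, L, w):
    """Normalised weighted HSIC, a value in [0, 1] (assumed dependence measure)."""
    num = max(weighted_hsic(K, L, w), 0.0)
    den_sq = max(weighted_hsic(K, K, w), 0.0) * max(weighted_hsic(L, L, w), 0.0)
    return num / np.sqrt(den_sq) if den_sq > 0 else 0.0

def posterior_prob_dependence(K, L, threshold, n_draws=300, seed=0):
    """Fraction of Bayesian-bootstrap draws whose correlation exceeds threshold."""
    rng = np.random.default_rng(seed)
    # Flat Dirichlet weights: the DP posterior as the prior strength goes to zero.
    W = rng.dirichlet(np.ones(K.shape[0]), size=n_draws)
    draws = np.array([kernel_correlation(K, L, w) for w in W])
    return float(np.mean(draws > threshold)), draws

# Toy mixed-type data: a numeric x, a label that depends on x, and one that does not.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y_dep = np.where(x > 0, "a", "b")
y_ind = rng.choice(["a", "b"], size=200)

K = gaussian_gram(x)
p_dep, _ = posterior_prob_dependence(K, delta_gram(y_dep), threshold=0.1)
p_ind, _ = posterior_prob_dependence(K, delta_gram(y_ind), threshold=0.1)
print(f"P(more than weakly dependent): dependent pair={p_dep:.2f}, independent pair={p_ind:.2f}")
```

In this sketch the returned probability is simply the share of posterior draws above the chosen threshold, so it should be read against that threshold and the kernel choices rather than as an absolute quantity.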