Computing the expectation of kernel functions is a ubiquitous task in machine learning, with applications ranging from classical support vector machines to kernel embeddings of distributions in probabilistic modeling, statistical inference, causal discovery, and deep learning. In all these scenarios, we tend to resort to Monte Carlo estimates, as expectations of kernels are intractable in general. In this work, we characterize the conditions under which expected kernels can be computed exactly and efficiently, by leveraging recent advances in probabilistic circuit representations. We first construct a circuit representation for kernels and propose an approach to such tractable computation. We then demonstrate possible advancements for kernel embedding frameworks by exploiting tractable expected kernels to derive new algorithms for two challenging scenarios: 1) reasoning under missing data with kernel support vector regressors; 2) devising a collapsed black-box importance sampling scheme. Finally, we empirically evaluate both algorithms and show that they outperform standard baselines on a variety of datasets.
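For concreteness, the quantity at stake is the expected kernel E_{x~p, x'~q}[k(x, x')]. The sketch below illustrates the Monte Carlo baseline the abstract alludes to, not the paper's circuit-based method; the RBF kernel and the Gaussian choices for p and q are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (an assumed setup, not the paper's method): the expected
# kernel E_{x ~ p, x' ~ q}[k(x, x')] is intractable in general, so one
# falls back on the Monte Carlo estimate
#   (1 / (n * m)) * sum_i sum_j k(x_i, x'_j),  x_i ~ p, x'_j ~ q.

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2), broadcast over batches."""
    return np.exp(-gamma * np.sum((x - y) ** 2, axis=-1))

rng = np.random.default_rng(0)
n, m, d = 500, 500, 3

# Samples from two hypothetical distributions p and q (Gaussians here).
xs = rng.normal(loc=0.0, scale=1.0, size=(n, d))   # x_i  ~ p
ys = rng.normal(loc=0.5, scale=1.0, size=(m, d))   # x'_j ~ q

# Monte Carlo estimate: average the kernel over all sample pairs.
pairwise = rbf_kernel(xs[:, None, :], ys[None, :, :])  # shape (n, m)
mc_estimate = pairwise.mean()
print(f"Monte Carlo expected-kernel estimate: {mc_estimate:.4f}")
```

The paper's contribution is to replace this sampling-based estimate with an exact, efficient computation when p, q, and k admit suitable probabilistic circuit representations.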