Computing the expectation of some kernel function is ubiquitous in machine learning, from the classical theory of support vector machines to exploiting kernel embeddings of distributions in applications ranging from probabilistic modeling and statistical inference to causal discovery and deep learning. In all these scenarios, we tend to resort to Monte Carlo estimates, as expectations of kernels are intractable in general. In this work, we characterize the conditions under which we can compute expected kernels exactly and efficiently, by leveraging recent advances in probabilistic circuit representations. We first construct a circuit representation for kernels and propose an approach to such tractable computation. We then demonstrate possible advancements for kernel embedding frameworks by exploiting tractable expected kernels to derive new algorithms for two challenging scenarios: 1) reasoning under missing data with kernel support vector regressors; 2) devising a collapsed black-box importance sampling scheme. Finally, we empirically evaluate both algorithms and show that they outperform standard baselines on a variety of datasets.
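For concreteness, the expected kernel in question, and the Monte Carlo estimate typically used to approximate it, can be sketched as follows. The notation here is illustrative rather than the paper's own: p and q denote the two input distributions, k the kernel function, and N a sample budget; the sums are written for discrete domains and become integrals in the continuous case.

\[
\mathbb{E}_{\mathbf{x} \sim p,\, \mathbf{y} \sim q}\!\left[k(\mathbf{x}, \mathbf{y})\right]
= \sum_{\mathbf{x}} \sum_{\mathbf{y}} p(\mathbf{x})\, q(\mathbf{y})\, k(\mathbf{x}, \mathbf{y})
\;\approx\; \frac{1}{N} \sum_{i=1}^{N} k\big(\mathbf{x}^{(i)}, \mathbf{y}^{(i)}\big),
\qquad \mathbf{x}^{(i)} \sim p,\ \mathbf{y}^{(i)} \sim q.
\]

The left-hand side is the quantity that the circuit-based approach computes exactly under the stated conditions; the right-hand side is the sampling-based estimate that serves as the standard fallback.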