We propose a randomized greedy search algorithm to find a point estimate for a random partition based on a loss function and posterior Monte Carlo samples. Given the large size and awkward discrete nature of the search space, minimizing the posterior expected loss is challenging. Our approach is a stochastic search based on a series of greedy optimizations performed in random order, and it is embarrassingly parallel. We consider several loss functions, including Binder loss and the variation of information. We note that criticisms of Binder loss stem from using equal misclassification penalties, and we show an efficient means to compute Binder loss with potentially unequal penalties. Furthermore, we extend the original variation of information to allow unequal penalties and show that this incurs no additional computational cost. We provide a reference implementation of our algorithm. Using a variety of examples, we show that our method produces clustering estimates that attain lower expected loss and are obtained faster than those of existing methods.
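To make the setup concrete, the following is a minimal sketch, not the reference implementation described above; the helper names coclustering_probs, expected_binder_loss, and greedy_sweep are hypothetical. It estimates pairwise co-clustering probabilities from posterior partition samples, evaluates the posterior expected Binder loss of a candidate clustering with possibly unequal penalties a and b, and performs one greedy pass over the items in a random order.

```python
import numpy as np

def coclustering_probs(samples):
    """Pairwise posterior co-clustering probabilities pi[i, j] = P(items i and j
    share a cluster), estimated from an (S, n) array of posterior partition
    samples, where each row is a vector of cluster labels."""
    S, n = samples.shape
    pi = np.zeros((n, n))
    for labels in samples:
        pi += (labels[:, None] == labels[None, :])
    return pi / S

def expected_binder_loss(candidate, pi, a=1.0, b=1.0):
    """Posterior expected Binder loss of a candidate clustering.

    a penalizes pairs placed together in the candidate but apart under the
    posterior; b penalizes pairs placed apart in the candidate but together
    under the posterior. Equal penalties (a = b) recover the usual Binder loss
    up to a constant scale."""
    together = (candidate[:, None] == candidate[None, :])
    iu = np.triu_indices(len(candidate), k=1)  # count each pair once
    return np.sum(a * (1.0 - pi[iu]) * together[iu]
                  + b * pi[iu] * ~together[iu])

def greedy_sweep(candidate, pi, a=1.0, b=1.0, rng=None):
    """One greedy pass: visit items in a random order and move each item to the
    existing cluster (or a new singleton) that most reduces the expected loss."""
    rng = np.random.default_rng(rng)
    candidate = candidate.copy()
    for i in rng.permutation(len(candidate)):
        options = list(np.unique(candidate)) + [candidate.max() + 1]  # new cluster
        losses = []
        for k in options:
            trial = candidate.copy()
            trial[i] = k
            losses.append(expected_binder_loss(trial, pi, a, b))
        candidate[i] = options[int(np.argmin(losses))]
    return candidate
```

In practice one would iterate such sweeps from a random initial partition until no move improves the loss, repeat from many random starts (these runs are independent and hence embarrassingly parallel), and report the candidate with the smallest expected loss; the same search applies to other loss functions, such as the variation of information, by swapping in a different expected-loss routine.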