The Exact Asymptotic Form of Bayesian Generalization Error in Latent Dirichlet Allocation

Latent Dirichlet allocation (LDA) extracts essential information from data by Bayesian inference. It is applied to knowledge discovery, through dimensionality reduction and clustering, in many fields. However, its generalization error had not yet been clarified, because LDA is a singular statistical model: there is no one-to-one mapping from parameters to probability distributions. In this paper, we give the exact asymptotic form of its generalization error and marginal likelihood by a theoretical analysis of its learning coefficient using algebraic geometry. The theoretical result shows that the Bayesian generalization error in LDA is expressed in terms of that in matrix factorization plus a penalty arising from the simplex restriction on LDA's parameter region. A numerical experiment is consistent with the theoretical result.
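The lack of a one-to-one mapping from parameters to distributions can be seen in a small numerical sketch. It is an illustration only and is not taken from the paper: it assumes the common view of LDA as a product of two column-stochastic matrices (a word-topic matrix `A` and a topic-document matrix `B`), and the sizes `M`, `N`, `H` are hypothetical. Relabeling the topics changes the parameters but leaves the modeled word distributions untouched, which is the simplest instance of non-identifiability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: vocabulary size M, number of documents N, number of topics H.
M, N, H = 5, 4, 3

# Assumed parameterization: every column lies on a probability simplex.
A = rng.dirichlet(np.ones(M), size=H).T  # M x H, column k = word distribution of topic k
B = rng.dirichlet(np.ones(H), size=N).T  # H x N, column j = topic proportions of document j

# Relabel the topics: permute the columns of A and the rows of B the same way.
perm = [2, 0, 1]
A_perm = A[:, perm]
B_perm = B[perm, :]

# The parameters differ, but the word distribution of every document is identical.
same_distribution = np.allclose(A @ B, A_perm @ B_perm)
different_parameters = not np.allclose(A, A_perm)
print(same_distribution, different_parameters)
```

Permutation is only the most visible case; the same product `A @ B` can also be reached from parameter sets that are not relabelings of each other, for example when more topics are fitted than the data need.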


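Related content

Generalization error

The generalization ability of a learning method is the ability of the model it learns to predict unseen data, and it is an essential property of the method. In practice, the most common way to evaluate generalization ability is to measure the generalization error on test data. Generalization error bounds characterize the gap between a learning algorithm's empirical risk and its expected risk, together with the rate at which that gap converges. The generalization error of a learning machine is a function that describes how far the student machine remains from the teacher machine after it has learned from sample data.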
