Topic model evaluation, like evaluation of other unsupervised methods, can be contentious. However, the field has coalesced around automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference corpus. Contemporary neural topic models surpass classical ones according to these metrics. At the same time, topic model evaluation suffers from a validation gap: automated coherence, developed for classical models, has not been validated using human experimentation for neural models. In addition, a meta-analysis of topic modeling literature reveals a substantial standardization gap in automated topic modeling benchmarks. To address the validation gap, we compare automated coherence with the two most widely accepted human judgment tasks: topic rating and word intrusion. To address the standardization gap, we systematically evaluate a dominant classical model and two state-of-the-art neural models on two commonly used datasets. Automated evaluations declare a winning model when corresponding human evaluations do not, calling into question the validity of fully automatic evaluations independent of human judgments.
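A common instantiation of the automated coherence described above is average normalized pointwise mutual information (NPMI) over pairs of a topic's top words; the exact variant, window size, and reference corpus differ across papers, so the formula below is a sketch of this family of metrics rather than a definition of the specific metric evaluated here. For a topic's top-$N$ words $w_1, \ldots, w_N$,
\[
C_{\mathrm{NPMI}} \;=\; \binom{N}{2}^{-1} \sum_{i<j} \frac{\log\dfrac{P(w_i, w_j)+\varepsilon}{P(w_i)\,P(w_j)}}{-\log\bigl(P(w_i, w_j)+\varepsilon\bigr)},
\]
where the probabilities are estimated from word and word-pair frequencies (typically within sliding windows) in a reference corpus, and $\varepsilon$ is a small smoothing constant.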