Millions of people, irrespective of socioeconomic and demographic background, depend on Wikipedia articles every day to keep themselves informed about popular as well as obscure topics. Editors categorize articles into several quality classes, which indicate their reliability as encyclopedic content. This manual designation is an onerous task because it requires profound knowledge of encyclopedic language as well as navigating a circuitous set of wiki guidelines. In this paper we propose Neural Wikipedia Quality Monitor (NwQM), a novel deep learning model that accumulates signals from several key information sources, such as article text, metadata, and images, to obtain an improved Wikipedia article representation. We compare our approach against a wide range of existing solutions and, through detailed ablation studies, show an 8% improvement over state-of-the-art approaches.
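To make the multimodal-fusion idea concrete, here is a minimal illustrative sketch, not the authors' implementation: each modality (text, metadata, images) is assumed to arrive as a precomputed embedding, projected into a shared space, concatenated, and classified into a quality class. All layer names, dimensions, and the six-class label set are assumptions for illustration.

```python
# Hypothetical sketch of a multimodal quality classifier in the spirit of NwQM.
# Assumes text/metadata/image embeddings are precomputed upstream; all sizes
# and the six quality classes (e.g. FA, GA, B, C, Start, Stub) are assumptions.
import torch
import torch.nn as nn

class QualityMonitor(nn.Module):
    def __init__(self, text_dim=768, meta_dim=32, image_dim=512,
                 hidden_dim=256, num_classes=6):
        super().__init__()
        # Project each modality into a shared hidden space.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.meta_proj = nn.Linear(meta_dim, hidden_dim)
        self.image_proj = nn.Linear(image_dim, hidden_dim)
        # Fuse the concatenated projections and predict a quality class.
        self.classifier = nn.Sequential(
            nn.Linear(3 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_emb, meta_feats, image_emb):
        fused = torch.cat([
            torch.relu(self.text_proj(text_emb)),
            torch.relu(self.meta_proj(meta_feats)),
            torch.relu(self.image_proj(image_emb)),
        ], dim=-1)
        return self.classifier(fused)  # logits over quality classes

# Usage with random stand-in embeddings for a single article:
model = QualityMonitor()
logits = model(torch.randn(1, 768), torch.randn(1, 32), torch.randn(1, 512))
print(logits.argmax(dim=-1))  # predicted quality-class index
```

Late fusion by concatenation is just one plausible design; the point of the sketch is only that heterogeneous article signals are mapped into a common representation before classification.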