Smoothing is an essential tool in many NLP tasks, and numerous techniques have been developed for it. Among the most widely used are Kneser-Ney smoothing (KNS) and its variants, including Modified Kneser-Ney smoothing (MKNS), which are widely considered to be among the best smoothing methods available. The original KNS was designed to preserve the marginal distributions of the original model, but this property was not maintained in the development of MKNS. In this article I propose a refined version of MKNS that preserves these marginal distributions while keeping the advantages of both previous versions. Besides these advantageous properties, the new smoothing method is shown to achieve about the same results as MKNS in a standard language modelling task.