Story generation aims to produce a long narrative conditioned on a given input. Despite the success of prior work applying pre-trained models, current neural models still struggle to generate high-quality long narratives in Chinese. We hypothesise that this stems from ambiguity in syntactically parsing Chinese, which lacks explicit delimiters for word segmentation; as a result, neural models capture features of Chinese narratives inefficiently. In this paper, we present a new generation framework that strengthens feature capturing by informing the generation model of dependencies between words, and further improves semantic representation learning through synonym denoising training. We conduct a range of experiments, and the results show that our framework outperforms state-of-the-art Chinese generation models on all evaluation metrics, confirming the benefits of enhanced dependency and semantic representation learning.
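To make the synonym denoising idea concrete, the sketch below shows one plausible way to build (noisy input, original sentence) training pairs: segment a Chinese sentence, stochastically swap words for synonyms, and let a seq2seq generator learn to reconstruct the original. This is a minimal illustration, not the paper's actual pipeline; the `SYNONYMS` table is a hypothetical stand-in for a real Chinese synonym lexicon, and `jieba` is used only as a convenient off-the-shelf segmenter (the dependency features mentioned above would analogously come from an off-the-shelf Chinese dependency parser).

```python
import random

import jieba  # off-the-shelf Chinese word segmenter (Chinese has no explicit word delimiters)

# Hypothetical synonym table; a real system would draw on a Chinese
# synonym lexicon rather than a hand-written dictionary.
SYNONYMS = {
    "高兴": ["开心", "快乐"],
    "故事": ["叙事"],
}


def synonym_noise(sentence: str, p: float = 0.15) -> str:
    """Corrupt a sentence by replacing words with synonyms with probability p.

    The pair (noisy sentence, original sentence) can then serve as one
    denoising training example for a sequence-to-sequence generator.
    """
    words = jieba.lcut(sentence)
    noisy = [
        random.choice(SYNONYMS[w]) if w in SYNONYMS and random.random() < p else w
        for w in words
    ]
    return "".join(noisy)


# Example: build one denoising pair (p=1.0 forces every known swap).
original = "他讲了一个高兴的故事"
pair = (synonym_noise(original, p=1.0), original)
print(pair)
```

Training on such pairs pushes the encoder to map synonymous surface forms to nearby representations, which is one way to read the abstract's claim that synonym denoising augments semantic representation learning.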