Literary texts are usually rich in meaning, and their interpretation complicates corpus studies and automatic processing. There have been several attempts to create collections of literary texts annotated with literary elements such as the author's speech, characters, events, and scenes. However, these attempts resulted only in small collections and standalone annotation rules. The present article describes an experiment on the lexical annotation of text worlds in a literary work and quantitative methods for comparing the resulting annotations. The experiment shows that annotation rules must be set much more strictly if annotators are to agree well on tag assignment. However, if the boundaries between text worlds and other elements are the result of subjective interpretation, they should be modeled as fuzzy entities.