This paper presents a Natural Language Processing (NLP) approach to detecting spoilers in book reviews, using the University of California San Diego (UCSD) Goodreads Spoiler dataset. We explore the use of LSTM, BERT, and RoBERTa language models to perform spoiler detection at the sentence level. We contrast this with a UCSD paper that performed the same task but relied on handcrafted features in its data preparation. Despite eschewing handcrafted features, our LSTM model slightly exceeded the UCSD team's performance in spoiler detection.