Developing and improving computational approaches to covering news can increase journalistic output and improve the way stories are covered. In this work we approach the problem of covering crime stories in Los Angeles. We present a machine-in-the-loop system that covers individual crimes by (1) learning prototypical coverage archetypes, and hence article structure, from classical news articles on crime and (2) using output from the Los Angeles Police Department to generate "lede paragraphs", the first structural unit of crime articles. We introduce a probabilistic graphical model for learning article structure and a rule-based system for generating ledes. We hope our work can lead to systems that use these components together to form the skeletons of news articles covering crime. This work was done as a class project in Jonathan May's Advanced Natural Language Processing course, Fall 2019.