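[Zhuanzhi Collection 07] Automatic Summarization (AS): A Complete Collection of Knowledge Resources (Getting Started / Advanced / Code / Datasets / Experts, etc.) (PDF download included)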

November 6, 2017, Zhuanzhi, Zhuanzhi Content Team

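[Introduction] Topic collections are one of Zhuanzhi's core features, offering users systematic learning resources across the AI field. Each collection gathers and organizes the best (Awesome) material on a topic from across the web, so that AI practitioners can learn conveniently and solve problems at work. Built on Zhuanzhi's AI topic knowledge tree, the collections are produced by professional human editors working with algorithmic tools, and they are updated continuously. If you are interested in creating topic collections, you are welcome to join the Zhuanzhi AI Creator Program. Today we present the seventh Zhuanzhi topic collection: a comprehensive set of resources on Automatic Summarization (getting started, advanced papers, courses, conferences, experts, and more). To access Zhuanzhi, visit www.zhuanzhi.ai, or follow the WeChat official account and reply "专知" in the account's message box, then search for the topic "自动文摘" (automatic summarization). A PDF download link for this article is provided at the end. This is an initial version; corrections and additions are welcome via a message to the account. Feel free to share and forward.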

Get to know Zhuanzhi: a new way of knowing!

Automatic Summarization: Zhuanzhi Collection

Getting Started
Advanced Papers
Code
Tutorial
Datasets
Domain Experts

Getting Started

Automatic Summarization series (parts 1-13) [http://rsarxiv.github.io/tags/seq2seq/]
Text summarization with TensorFlow, official Google release [https://research.googleblog.com/2016/08/text-summarization-with-tensorflow.html]
Your tl;dr by an ai: a deep reinforced model for abstractive summarization (reinforcement learning for document summarization) [https://einstein.ai/research/your-tldr-by-an-ai-a-deep-reinforced-model-for-abstractive-summarization]
Teaching Machines to Summarize [https://zhuanlan.zhihu.com/p/21426100?refer=paperweekly]

Advanced Papers

Rasim M Alguliev, Ramiz M Aliguliyev, Makrufa S Hajirahimova, and Chingiz A Mehdiyev. 2011. MCMR: Maximum coverage and minimum redundant text summarization model. Expert Systems with Applications 38, 12 (2011), 14514–14522. [http://www.sciencedirect.com/science/article/pii/S0957417411008177]
Rasim M Alguliev, Ramiz M Aliguliyev, and Nijat R Isazade. 2013. Multiple documents summarization based on evolutionary optimization algorithm. Expert Systems with Applications 40, 5 (2013), 1675–1689. [http://www.sciencedirect.com/science/article/pii/S0957417412010688]
M. Allahyari, S. Pouriyeh, M. Assefi, S. Safaei, E. D. Trippe, J. B. Gutierrez, and K. Kochut. 2017. A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques. ArXiv e-prints (2017). arXiv:1707.02919 [https://arxiv.org/abs/1707.02919]
Einat Amitay and Cécile Paris. 2000. Automatically summarising web sites: is there a way around it? In Proceedings of the ninth international conference on Information and knowledge management. ACM, 173–179. [https://dl.acm.org/citation.cfm?id=354756.354816]
Elena Baralis, Luca Cagliero, Saima Jabeen, Alessandro Fiori, and Sajid Shah. 2013. Multi-document summarization based on the Yago ontology. Expert Systems with Applications 40, 17 (2013), 6976–6984. [http://www.sciencedirect.com/science/article/pii/S0957417413004429]
Taylor Berg-Kirkpatrick, Dan Gillick, and Dan Klein. 2011. Jointly learning to extract and compress. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 481–490. [https://dl.acm.org/citation.cfm?id=2002534&preflayout=flat]
Asli Celikyilmaz and Dilek Hakkani-Tur. 2010. A hybrid hierarchical model for multi-document summarization. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 815–824. [https://dl.acm.org/citation.cfm?id=1858765]
Ping Chen and Rakesh Verma. 2006. A query-based medical information summarization system using ontology knowledge. In Computer-Based Medical Systems, 2006. CBMS 2006. 19th IEEE International Symposium on. IEEE, 37–42. [https://dl.acm.org/citation.cfm?id=1153019]
Freddy Chong Tat Chua and Sitaram Asur. 2013. Automatic Summarization of Events from Social Media. In ICWSM. [https://www.aaai.org/ocs/index.php/ICWSM/ICWSM13/paper/view/6057/0]
John M Conroy and Dianne P O’Leary. 2001. Text summarization via hidden markov models. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 406–407. [http://pdfs.semanticscholar.org/1213/3cfc6688cc2cdea57595b045a28b94d98f1d.pdf]
Hal Daumé III and Daniel Marcu. 2006. Bayesian query-focused summarization. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 305–312. [https://dl.acm.org/citation.cfm?id=1220214]
J-Y Delort, Bernadette Bouchon-Meunier, and Maria Rifqi. 2003. Enhanced web document summarization using hyperlinks. In Proceedings of the fourteenth ACM conference on Hypertext and hypermedia. ACM, 208–215. [http://dl.acm.org/citation.cfm?id=900097]
Günes Erkan and Dragomir R Radev. 2004. LexRank: Graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res.(JAIR) 22, 1 (2004), 457–479. [https://arxiv.org/abs/1109.2128]
Yihong Gong and Xin Liu. 2001. Generic text summarization using relevance measure and latent semantic analysis. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 19–25. [https://dl.acm.org/citation.cfm?doid=383952.383955]
Vishal Gupta and Gurpreet Singh Lehal. 2010. A survey of text summarization extractive techniques. Journal of Emerging Technologies in Web Intelligence 2, 3 (2010), 258–268. [http://www.learnpunjabi.org/pdf/survey-paper.pdf]
Ben Hachey, Gabriel Murray, and David Reitter. 2006. Dimensionality reduction aids term co-occurrence based multi-document summarization. In Proceedings of the workshop on task-focused summarization and question answering. Association for Computational Linguistics, 1–7. [http://www.ltg.ed.ac.uk/np/publications/ltg/papers/Hachey2006Dimensionality.pdf]
John Hannon, Kevin McCarthy, James Lynch, and Barry Smyth. 2011. Personalized and automatic social summarization of events in video. In Proceedings of the 16th international conference on Intelligent user interfaces. ACM, 335–338. [https://dl.acm.org/citation.cfm?id=1943459]
Sanda Harabagiu and Finley Lacatusu. 2005. Topic themes for multi-document summarization. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 202–209. [https://dl.acm.org/citation.cfm?id=1076071]
Leonhard Hennig, Winfried Umbrath, and Robert Wetzker. 2008. An ontology-based approach to text summarization. In Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT’08. IEEE/WIC/ACM International Conference on, Vol. 3. IEEE, 291–294. [http://dl.acm.org/citation.cfm?id=1487345]
Meishan Hu, Aixin Sun, and Ee-Peng Lim. 2007. Comments-oriented blog summarization by sentence extraction. In Proceedings of the sixteenth ACM conference on Conference on information and knowledge management. ACM, 901–904. [https://dl.acm.org/citation.cfm?id=1321571&CFID=824361189&CFTOKEN=11022411]
Meishan Hu, Aixin Sun, and Ee-Peng Lim. 2008. Comments-oriented document summarization: understanding documents with readers’ feedback. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 291–298. [https://dl.acm.org/citation.cfm?id=1390385&CFID=824361189&CFTOKEN=11022411]
Elena Lloret and Manuel Palomar. 2012. Text summarisation in progress: a literature review. Artificial Intelligence Review 37, 1 (2012), 1–41. [https://link.springer.com/article/10.1007%2Fs10462-011-9216-z]
Hans Peter Luhn. 1958. The automatic creation of literature abstracts. IBM Journal of research and development 2, 2 (1958), 159–165. [http://www.di.ubi.pt/~jpaulo/competence/general/(1958)Luhn.pdf]
Inderjeet Mani and Eric Bloedorn. 1999. Summarizing similarities and differences among related documents. Information Retrieval 1, 1-2 (1999), 35–67.
Inderjeet Mani, Gary Klein, David House, Lynette Hirschman, Therese Firmin, and Beth Sundheim. 2002. SUMMAC: a text summarization evaluation. Natural Language Engineering 8, 01 (2002), 43–68. [https://www.researchgate.net/publication/231901086_SUMMAC_a_text_summarization_evaluation]
Qiaozhu Mei and ChengXiang Zhai. 2008. Generating Impact-Based Summaries for Scientific Literature. In ACL, Vol. 8. Citeseer, 816–824.
Rada Mihalcea and Paul Tarau. 2004. TextRank: Bringing order into texts. Association for Computational Linguistics. [https://digital.library.unt.edu/ark:/67531/metadc30962/]
Rada Mihalcea and Paul Tarau. 2005. A language independent algorithm for single and multiple document summarization. (2005). [https://www.researchgate.net/publication/228340005_A_language_independent_algorithm_for_single_and_multiple_document_summarization]
Liu Na, Li Ming-xia, Lu Ying, Tang Xiao-jun, Wang Hai-wen, and Xiao Peng. 2014. Mixture of topic model for multi-document summarization. In Control and Decision Conference (2014 CCDC), The 26th Chinese. IEEE, 5168–5172. [http://ieeexplore.ieee.org/document/6853102/metrics]
Ani Nenkova and Amit Bagga. 2004. Facilitating email thread access by extractive summary generation. Recent advances in natural language processing III: selected papers from RANLP 2003 (2004), 287. [https://www.researchgate.net/publication/221303547_Facilitating_email_thread_access_by_extractive_summary_generation]
Ani Nenkova and Kathleen McKeown. 2012. A survey of text summarization techniques. In Mining Text Data. Springer, 43–76. [https://www.mendeley.com/research-papers/survey-text-summarization-techniques/]
Paula S Newman and John C Blitzer. 2003. Summarizing archived discussions: a beginning. In Proceedings of the 8th international conference on Intelligent user interfaces. ACM, 273–276. [https://dl.acm.org/citation.cfm?id=604097]
You Ouyang, Wenjie Li, Sujian Li, and Qin Lu. 2011. Applying regression models to query-focused multi-document summarization. Information Processing & Management 47, 2 (2011), 227–237. [http://www.sciencedirect.com/science/article/pii/S0306457310000257]
Makbule Gulcin Ozsoy, Ilyas Cicekli, and Ferda Nur Alpaslan. 2010. Text summarization of turkish texts using latent semantic analysis. In Proceedings of the 23rd international conference on computational linguistics. Association for Computational Linguistics, 869–876. [https://dl.acm.org/citation.cfm?id=1873879]
Vahed Qazvinian and Dragomir R Radev. 2008. Scientific paper summarization using citation summary networks. In Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1. Association for Computational Linguistics, 689–696. [https://dl.acm.org/citation.cfm?id=1599081.1599168]
Vahed Qazvinian, Dragomir R Radev, Saif M Mohammad, Bonnie Dorr, David Zajic, Michael Whidby, and Taesun Moon. 2014. Generating extractive summaries of scientific paradigms. arXiv preprint arXiv:1402.0556 (2014). [https://www.researchgate.net/publication/229534087_Generating_surveys_of_scientific_paradigms]
Dragomir R Radev, Eduard Hovy, and Kathleen McKeown. 2002. Introduction to the special issue on summarization. Computational linguistics 28, 4 (2002), 399–408. [https://dl.acm.org/citation.cfm?id=638178.638179]
Dragomir R Radev, Hongyan Jing, and Malgorzata Budzikowska. 2000. Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies. In Proceedings of the 2000 NAACL-ANLP Workshop on Automatic Summarization. Association for Computational Linguistics, 21–30. [http://www.docin.com/p-853652484.html]
Dragomir R Radev, Hongyan Jing, Małgorzata Styś, and Daniel Tam. 2004. Centroid-based summarization of multiple documents. Information Processing & Management 40, 6 (2004), 919–938. [http://www.sciencedirect.com/science/article/pii/S0306457303000955]
Owen Rambow, Lokesh Shrestha, John Chen, and Chirsty Lauridsen. 2004. Summarizing email threads. In Proceedings of HLT-NAACL 2004: Short Papers. Association for Computational Linguistics, 105–108. [https://dl.acm.org/citation.cfm?id=1614011]
Zhaochun Ren, Shangsong Liang, Edgar Meij, and Maarten de Rijke. 2013. Personalized time-aware tweets summarization. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. ACM, 513–522. [https://staff.fnwi.uva.nl/m.derijke/wp-content/papercite-data/pdf/ren-personalized-2013.pdf]
Horacio Saggion and Thierry Poibeau. 2013. Automatic text summarization: Past, present and future. In Multi-source, Multilingual Information Extraction and Summarization. Springer, 3–21. [https://hal.archives-ouvertes.fr/hal-00782442/document]
Gerard Salton and Christopher Buckley. 1988. Term-weighting approaches in automatic text retrieval. Information processing & management 24, 5 (1988), 513–523. [http://www.sciencedirect.com/science/article/pii/0306457388900210]
Yogesh Sankarasubramaniam, Krishnan Ramanathan, and Subhankar Ghosh. 2014. Text summarization using Wikipedia. Information Processing & Management 50, 3 (2014), 443–461. [http://www.sciencedirect.com/science/article/pii/S0306457314000119]
Beaux P Sharifi, David I Inouye, and Jugal K Kalita. 2013. Summarization of Twitter Microblogs. Comput. J. (2013), bxt109. [http://cs.uccs.edu/~jkalita/papers/2013/SharifiBeauxComputerJournal2013.pdf]
E. D. Trippe, J. B. Aguilar, Y. H. Yan, M. V. Nural, J. A. Brady, M. Assefi, S. Safaei, M. Allahyari, S. Pouriyeh, M. R. Galinski, J. C. Kissinger, and J. B. Gutierrez. 2017. A Vision for Health Informatics: Introducing the SKED Framework. An Extensible Architecture for Scientific Knowledge Extraction from Data. ArXiv e-prints (2017). arXiv:1706.07992 [https://arxiv.org/abs/1706.07992]
Neural Summarization by Extracting Sentences and Words [https://arxiv.org/pdf/1603.07252.pdf]
Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond [https://arxiv.org/pdf/1602.06023.pdf]
A Neural Attention Model for Abstractive Sentence Summarization [https://arxiv.org/pdf/1509.00685.pdf]
A Deep Reinforced Model for Abstractive Summarization [https://arxiv.org/pdf/1705.04304.pdf]
Text summarization using Latent Semantic Analysis [https://www.researchgate.net/publication/220195824_Text_summarization_using_Latent_Semantic_Analysis]
TextRank: Bringing Order into Texts [https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf]
Sentence Extraction Based Single Document Summarization [http://oldwww.iiit.ac.in/cgi-bin/techreports/display_detail.cgi?id=IIIT/TR/2008/97]

Surveys

Automatic Summarization By Ani Nenkova and Kathleen McKeown [https://www.cis.upenn.edu/nenkova/1500000015-Nenkova.pdf]
Text Summarization Techniques: A Brief Survey [https://arxiv.org/pdf/1707.02268.pdf]
A SURVEY OF TEXT SUMMARIZATION TECHNIQUES [https://www.cs.bgu.ac.il/~elhadad/nlp16/nenkova-mckeown.pdf]
Recent automatic text summarization techniques: a survey [https://link.springer.com/article/10.1007/s10462-016-9475-9]
A Survey of Nearly 70 Years of Automatic Text Summarization Research, by Liu Jiayi and Zou Yimin [http://210.76.106.46/qk/90051A/201707/672573777.html]

Code

Sequence-to-Sequence with Attention Model for Text Summarization [https://github.com/tensorflow/models/tree/master/research/textsum]
gensim.summarization offers TextRank summarization [https://radimrehurek.com/gensim/summarization/summariser.html]
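
For readers who want to see the idea behind TextRank-style extractive summarization before trying a library, here is a minimal, dependency-free sketch. It is not gensim's implementation. The function names (split_sentences, similarity, textrank_summarize), the naive sentence splitter, the use of unique words in the similarity, and the default parameters are all assumptions made for illustration.

```python
import math
import re


def split_sentences(text):
    # Naive splitter (an assumption for this sketch): break after ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(tokens_a, tokens_b):
    # Shared words divided by the sum of log lengths, in the spirit of the
    # TextRank paper; using unique words is a simplification.
    words_a, words_b = set(tokens_a), set(tokens_b)
    denom = math.log(len(words_a) or 1) + math.log(len(words_b) or 1)
    if denom == 0:
        return 0.0
    return len(words_a & words_b) / denom


def textrank_summarize(text, top_n=2, damping=0.85, iterations=50):
    sentences = split_sentences(text)
    tokens = [re.findall(r"\w+", s.lower()) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Weighted, undirected sentence graph (no self-loops).
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # Weighted PageRank by power iteration.
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]

    # Keep the top_n sentences, returned in their original order.
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    sample = (
        "Automatic summarization shortens a document. "
        "Extractive summarization selects sentences from the document. "
        "Abstractive summarization rewrites the document in new words. "
        "Bananas are yellow."
    )
    for sentence in textrank_summarize(sample, top_n=2):
        print(sentence)
```

The sketch skips stop-word removal, stemming, and convergence checks, all of which a practical system would likely need. A sentence that shares no words with the rest of the text only ever receives the baseline score of 1 - damping, so it is ranked last.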

Tutorial

Automatic Text Summarization: Current State and Future, Wan Xiaojun, Peking University, October 16, 2016 [https://pan.baidu.com/s/1nuTUrSP]
Tutorial on automatic summarization [https://www.slideshare.net/dinel/orasan-ranlp2009] [https://pan.baidu.com/s/1o8bZJJk]
How to Run Text Summarization with TensorFlow [https://hackernoon.com/how-to-run-text-summarization-with-tensorflow-d4472587602d]
Text Summarization with Gensim [https://rare-technologies.com/text-summarization-with-gensim/]

Datasets

DUC 2004 [http://www.cis.upenn.edu/~nlp/corpora/sumrepo.html]
Opinosis Dataset - Topic related review sentences [http://kavita-ganesan.com/opinosis-opinion-dataset]
17 Timelines [http://kavita-ganesan.com/opinosis-opinion-dataset]
Legal Case Reports Data Set [http://archive.ics.uci.edu/ml/datasets/Legal+Case+Reports]