By producing summaries for long-running events, timeline summarization (TLS) underpins many information retrieval tasks. Successful TLS requires identifying an appropriate number of key dates (the timeline length) to cover. Doing so is challenging because the right length varies from one topic to another. Existing TLS solutions rely on either an event-agnostic fixed length or an expert-supplied setting. Neither strategy is desirable in real-life TLS scenarios. A fixed, event-agnostic setting ignores the diversity of events and how they develop, and hence can lead to low-quality timelines. Relying on expert-crafted settings is neither scalable nor sustainable when processing many dynamically changing events. This paper presents a TLS approach that determines the timeline length automatically and dynamically. We achieve this by employing the established elbow method from the machine learning community to automatically find the minimum number of dates within the time series needed to generate concise and informative summaries. We applied our approach to four English and Chinese TLS datasets and compared it against three prior methods. Experimental results show that our approach delivers comparable or even better summaries than state-of-the-art TLS methods, and it does so without expert involvement.
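To make the elbow idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes each candidate date already has an importance score (how that score is computed is not stated in the abstract), ranks the dates by score, and picks the elbow as the point farthest from the straight line joining the first and last points of the ranked curve. The function names `elbow_index` and `select_key_dates`, the chord-distance criterion, and the choice to keep the elbow date itself are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def elbow_index(values: Sequence[float]) -> int:
    """Index of the point farthest from the chord joining the first and last points.

    `values` is assumed to be sorted in descending order (a ranked score curve).
    Returns len(values) - 1 when there are fewer than three points.
    """
    n = len(values)
    if n < 3:
        return n - 1  # no interior point to bend at
    y_first, y_last = float(values[0]), float(values[-1])
    dx, dy = float(n - 1), y_last - y_first
    norm = (dx * dx + dy * dy) ** 0.5
    best_i, best_dist = 0, -1.0
    for i, y in enumerate(values):
        # Perpendicular distance from (i, y) to the line through the endpoints.
        dist = abs(dy * i - dx * (y - y_first)) / norm
        if dist > best_dist:
            best_i, best_dist = i, dist
    return best_i


def select_key_dates(date_scores: Dict[str, float]) -> List[str]:
    """Keep the top-ranked dates up to and including the elbow, in chronological order.

    Keys are assumed to be ISO-formatted dates so that string order is date order.
    """
    ranked = sorted(date_scores.items(), key=lambda kv: kv[1], reverse=True)
    k = elbow_index([score for _, score in ranked]) + 1
    return sorted(date for date, _ in ranked[:k])


# Toy scores invented for illustration only.
toy_scores = {
    "2020-01-05": 10.0,
    "2020-02-11": 9.0,
    "2020-01-20": 8.0,
    "2020-02-01": 2.0,
    "2020-03-05": 1.5,
    "2020-04-02": 1.0,
    "2020-05-10": 0.5,
}
key_dates = select_key_dates(toy_scores)
```

With this criterion the timeline length is `len(key_dates)`, so it adapts to the shape of each event's score curve rather than being fixed in advance. Whether the elbow date itself is kept or dropped is a design choice; this sketch keeps it.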