Video transcript summarization is a fundamental task for video understanding. Conventional approaches to transcript summarization are usually built on summarization data for written language, such as news articles, and this domain discrepancy can degrade model performance on spoken text. In this paper, we present VT-SSum, a spoken-language benchmark dataset for video transcript segmentation and summarization, which includes 125K transcript-summary pairs from 9,616 videos. VT-SSum takes advantage of videos from VideoLectures.NET, leveraging the slide content as weak supervision to generate extractive summaries for video transcripts. Experiments with a state-of-the-art deep learning approach show that a model trained on VT-SSum brings a significant improvement on the AMI spoken-text summarization benchmark. VT-SSum will be publicly available to support future research on video transcript segmentation and summarization.
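For illustration only, the sketch below shows one simple way slide content could serve as weak supervision for extractive labeling: transcript sentences that overlap strongly with the aligned slide's text are marked as summary sentences. The token-overlap score, the `overlap_threshold` value, and the function names are assumptions for this sketch, not the paper's actual alignment procedure.

```python
# Minimal sketch of slide-based weak supervision for extractive summarization.
# Assumption: a simple token-overlap score against the aligned slide text decides
# whether a transcript sentence is labeled as a summary sentence.

import re


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def label_summary_sentences(transcript_sentences: list[str],
                            slide_text: str,
                            overlap_threshold: float = 0.5) -> list[int]:
    """Return 0/1 labels: 1 if a sentence shares enough tokens with the slide."""
    slide_tokens = tokenize(slide_text)
    labels = []
    for sentence in transcript_sentences:
        tokens = tokenize(sentence)
        overlap = len(tokens & slide_tokens) / max(len(tokens), 1)
        labels.append(1 if overlap >= overlap_threshold else 0)
    return labels


if __name__ == "__main__":
    sentences = [
        "Today we will talk about transformers for summarization.",
        "Let me first take a sip of water.",
        "Transformers use self-attention to build contextual representations.",
    ]
    slide = "Transformers for summarization: self-attention, contextual representations"
    print(label_summary_sentences(sentences, slide))  # e.g. [1, 0, 1]
```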