Deaf and hard of hearing (DHH) individuals regularly rely on captioning while watching live TV, and regulatory agencies evaluate live TV captioning using various caption evaluation metrics. However, these metrics are often not informed by the preferences of DHH users or by how meaningful the captions are to them. There is a need for caption evaluation metrics that take the relative importance of words in a transcript into account. We conducted a correlation analysis between two types of word embeddings and human-annotated word-importance scores in an existing corpus. We found that normalized contextualized word embeddings generated with BERT correlated better with the manually annotated importance scores than word2vec-based embeddings did. We make available a pairing of word embeddings with their human-annotated importance scores. We also provide proof-of-concept utility by training word-importance models, achieving an F1-score of 0.57 on a 6-class word-importance classification task.
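The correlation analysis described above compares embedding-derived values against scalar importance annotations. A minimal sketch of that comparison, using Spearman rank correlation on toy data (the vectors below are random stand-ins for BERT or word2vec embeddings, and using the L2 norm as the embedding-derived feature is an illustrative assumption, not the paper's method):

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation via rank transform + Pearson.
    Assumes no ties, which holds for this toy example."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)

# Toy data: 8 words, 4-dimensional "embeddings", annotated importance in [0, 1].
embeddings = rng.normal(size=(8, 4))
importance = np.array([0.9, 0.1, 0.7, 0.2, 0.8, 0.05, 0.6, 0.3])

# One simple scalar feature per word: the norm of its embedding vector.
norms = np.linalg.norm(embeddings, axis=1)

rho = spearman(norms, importance)
print(f"Spearman rho = {rho:.3f}")
```

Rank correlation is a natural choice here because it asks only whether words the annotators ranked as more important also receive larger embedding-derived values, without assuming a linear relationship.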