Both humans and machines learn the meanings of unknown words through contextual information in a sentence, but not all contexts are equally helpful for learning. We introduce an effective method for capturing the level of contextual informativeness with respect to a given target word. Our study makes three main contributions. First, we develop models for estimating contextual informativeness, focusing on the instructional aspect of sentences. Our attention-based approach using pre-trained embeddings demonstrates state-of-the-art performance on our single-context dataset and on an existing multi-sentence context dataset. Second, we show how our model identifies the key contextual elements in a sentence that are likely to contribute most to a reader's understanding of the target word. Third, we examine how our contextual informativeness model, originally developed for vocabulary learning applications for students, can be used to develop better training curricula for word embedding models in batch learning and few-shot machine learning settings. We believe our results open new possibilities for applications that support language learning for both human and machine learners.