Textual entailment recognition is one of the basic natural language understanding (NLU) tasks. Understanding the meaning of sentences is a prerequisite for applying any natural language processing (NLP) technique to recognize textual entailment automatically. A text entails a hypothesis if and only if the truth of the hypothesis follows from the text. Classical approaches generally use the feature values of each word from a word embedding to represent the sentences. In this paper, we propose a novel approach to identifying the textual entailment relationship between a text and a hypothesis, introducing a new semantic feature based on an empirical threshold-based semantic text representation. We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair. We carried out several experiments on a benchmark entailment classification dataset (SICK-RTE). We train several machine learning (ML) algorithms using both semantic and lexical features to classify each text-hypothesis pair as entailment, neutral, or contradiction. Our empirical sentence representation technique enriches the semantic information of the texts and hypotheses and is found to be more efficient than the classical ones. Overall, our approach significantly outperforms known methods in understanding the meaning of the sentences for the textual entailment classification task.
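As an illustration of the element-wise Manhattan distance vector feature mentioned above, the following is a minimal sketch and not the implementation used in this work. It assumes each sentence is represented by the average of its word vectors; the toy embedding table and the helper names `sentence_vector` and `elementwise_manhattan` are hypothetical, and the empirical threshold-based representation is not reproduced because its details are not given here.

```python
from typing import Dict, List, Sequence

# Toy 3-dimensional embeddings, invented purely for illustration (assumption).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "dog": [1.0, 0.0, 2.0],
    "runs": [3.0, 2.0, 0.0],
    "animal": [1.0, 1.0, 2.0],
    "moves": [2.0, 2.0, 0.0],
}


def sentence_vector(tokens: Sequence[str],
                    embeddings: Dict[str, List[float]],
                    dim: int = 3) -> List[float]:
    """Average the vectors of the tokens found in the embedding table.

    Averaging is an assumed, common baseline, not the paper's representation.
    Unknown tokens are skipped; an all-unknown sentence maps to the zero vector.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def elementwise_manhattan(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Return the per-dimension absolute differences |u_i - v_i|.

    Summing the result gives the usual scalar Manhattan (L1) distance;
    keeping it as a vector yields one feature per embedding dimension.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [abs(a - b) for a, b in zip(u, v)]


text_vec = sentence_vector(["dog", "runs"], TOY_EMBEDDINGS)
hypothesis_vec = sentence_vector(["animal", "moves"], TOY_EMBEDDINGS)
feature = elementwise_manhattan(text_vec, hypothesis_vec)
```

A feature vector of this kind could then be passed, together with lexical features, to any standard classifier that predicts entailment, neutral, or contradiction.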