Transformer-based models have achieved impressive performance on various Natural Language Inference (NLI) benchmarks when trained on the corresponding training datasets. However, in some cases training samples may not be available, or collecting them may be time-consuming and resource-intensive. In this work, we address this challenge and present an exploratory study of unsupervised NLI, a paradigm in which no human-annotated training samples are available. We investigate NLI under three challenging settings, PH, P, and NPH, which differ in the extent of unlabeled data available for learning. As a solution, we propose a procedural data generation approach that leverages a set of sentence transformations to collect PHL (Premise, Hypothesis, Label) triplets for training NLI models, bypassing the need for human-annotated training datasets. Comprehensive experiments show that this approach achieves accuracies of 66.75%, 65.9%, and 65.39% in the PH, P, and NPH settings respectively, outperforming all existing baselines. Furthermore, fine-tuning our models with as little as ~0.1% of the training dataset (500 samples) yields 12.2% higher accuracy than a model trained from scratch on the same 500 instances.
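To make the idea of procedural PHL generation concrete, the following is a minimal sketch of how rule-based sentence transformations could turn unlabeled sentences into (Premise, Hypothesis, Label) triplets. The three transformations shown (clause extraction for entailment, auxiliary-verb negation for contradiction, pairing with a different sentence for neutral) and all names (`PHL`, `first_clause`, `negate`, `generate_phl`) are illustrative assumptions, not the transformation set described in this work, and the resulting labels are heuristic.

```python
import re
from typing import List, NamedTuple, Optional


class PHL(NamedTuple):
    premise: str
    hypothesis: str
    label: str


# Hypothetical, deliberately small list of auxiliaries used for negation.
AUXILIARIES = ("is", "are", "was", "were", "can", "will")


def first_clause(sentence: str) -> Optional[str]:
    """Return the text before ', and ' as a standalone sentence, or None."""
    head, sep, _ = sentence.rstrip(".").partition(", and ")
    return head + "." if sep else None


def negate(sentence: str) -> Optional[str]:
    """Insert 'not' after the first auxiliary verb, or return None."""
    pattern = r"\b(" + "|".join(AUXILIARIES) + r")\b"
    negated, count = re.subn(pattern, r"\1 not", sentence, count=1)
    return negated if count else None


def generate_phl(sentences: List[str]) -> List[PHL]:
    """Build heuristic PHL triplets from unlabeled sentences."""
    triplets = []
    for i, premise in enumerate(sentences):
        # Assumed entailment: one conjunct follows from the conjunction.
        clause = first_clause(premise)
        if clause is not None:
            triplets.append(PHL(premise, clause, "entailment"))
        # Assumed contradiction: negating an asserted clause.
        negated = negate(premise)
        if negated is not None:
            triplets.append(PHL(premise, negated, "contradiction"))
        # Assumed neutral: a different sentence from the pool.
        if len(sentences) > 1:
            other = sentences[(i + 1) % len(sentences)]
            triplets.append(PHL(premise, other, "neutral"))
    return triplets


if __name__ == "__main__":
    pool = [
        "A man is playing a guitar, and a woman is singing.",
        "The dog was asleep.",
    ]
    for triplet in generate_phl(pool):
        print(triplet)
```

Triplets produced this way could then be used as training data for a standard NLI classifier. Heuristics this simple will produce noisy labels (for example, two sentences from the same pool are not guaranteed to be neutral), so the sketch only illustrates the shape of the pipeline.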