A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logics) from unstructured natural language. In particular, we introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language: we utilize LLMs to map subformulas of the formalization back to the corresponding natural language fragments of the input. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization. The framework is agnostic to specific application domains and can be extended to similar specification languages and new neural models. We perform a user study to obtain a challenging dataset, which we use to run experiments on the quality of translations. We provide an open-source implementation, including a web-based frontend.
翻译:在执行任何核查任务时,必须严格地将理想的系统要求正规化,这往往限制了核查技术的应用,因为编写正式规格是一项容易出错和耗时的人工任务。为了便利这项工作,我们提出nl2pec,这是应用大语言模型(LLLMs)从非结构化的自然语言中得出正式规格(在时间逻辑上)的框架。特别是,我们采用新方法来发现和解决自然语言系统要求固有的模糊性:我们利用LLMs绘制正规化的子形式图,使其返回输入的相应自然语言碎片。用户反复添加、删除和编辑这些分译,以修正错误的正规化,这比手动重拟整个正规化容易。这个框架对具体应用领域是不可知性的,可以扩大到类似的规格语言和新的神经模型。我们进行用户研究,以获得具有挑战性的数据集,我们用来进行翻译质量实验。我们提供开放源实施,包括基于网络的前端。</s>