Case studies of development of verified programs with Dafny for accessibility assessment

From arXiv. Pre-print and extended version, including source code, of our paper accepted at FSEN 2023, the 10th IPM International Conference on Fundamentals of Software Engineering.

Formal verification techniques aim at formally proving the correctness of a computer program with respect to a formal specification, but the expertise and effort required for applying formal specification and verification techniques, together with scalability issues, have limited their practical application. In recent years, the tremendous progress with SAT and SMT solvers enabled the construction of a new generation of tools that promise to make formal verification more accessible for software engineers, by automating most if not all of the verification process. The Dafny system is a prominent example of that trend. However, little evidence exists yet about its accessibility. To help fill this gap, we conducted a set of 10 case studies of developing verified implementations in Dafny of some real-world algorithms and data structures, to determine its accessibility for software engineers. We found that, on average, the amount of code written for specification and verification purposes is of the same order of magnitude as the traditional code written for implementation and testing purposes (ratio of 1.14), an "overhead" that certainly pays off for high-integrity software. The performance of the Dafny verifier was impressive, with 2.4 proof obligations generated per line of code written, and 24 ms spent per proof obligation generated and verified, on average. However, we also found that the manual work needed in writing auxiliary verification code may be significant and difficult to predict and master. Hence, further automation and systematization of verification tasks are possible directions for future advances in the field.
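To give a feel for what the reported averages imply, the following is a rough back-of-the-envelope sketch, not a result from the paper. It assumes that the 1.14 ratio, the 2.4 proof obligations per line, and the 24 ms per proof obligation can simply be multiplied together, and that "per line of code written" covers both the implementation/testing lines and the specification/verification lines. The function name `estimate_verification_effort` and its parameter are hypothetical, introduced only for illustration.

```python
# Averages quoted in the abstract (the only figures taken from the source).
SPEC_TO_IMPL_RATIO = 1.14      # spec+verification lines per impl+test line
OBLIGATIONS_PER_LINE = 2.4     # proof obligations generated per line written
MS_PER_OBLIGATION = 24.0       # verifier time per proof obligation, in ms


def estimate_verification_effort(impl_and_test_lines: int) -> dict:
    """Hypothetical helper: extrapolate the averages to a project size.

    Assumption (not stated in the source): the averages are independent
    and apply uniformly to every line, so they can be multiplied.
    """
    spec_lines = impl_and_test_lines * SPEC_TO_IMPL_RATIO
    total_lines = impl_and_test_lines + spec_lines
    obligations = total_lines * OBLIGATIONS_PER_LINE
    verify_ms = obligations * MS_PER_OBLIGATION
    return {
        "spec_and_verification_lines": spec_lines,
        "total_lines": total_lines,
        "proof_obligations": obligations,
        "verifier_time_ms": verify_ms,
    }


if __name__ == "__main__":
    # Example: a component with 100 lines of implementation and test code.
    for key, value in estimate_verification_effort(100).items():
        print(f"{key}: {value:.1f}")
```

Under these assumptions, verifier time grows linearly with code size, which matches the abstract's point that the hard-to-predict cost lies in the manual writing of auxiliary verification code rather than in running the verifier.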

