Input constraints are useful for many software development tasks. For example, the input constraints of a function enable the generation of valid inputs, i.e., inputs that satisfy these constraints, to test the function more deeply. API functions of deep learning (DL) libraries have DL-specific input constraints, which are described informally in free-form API documentation. Existing constraint-extraction techniques are ineffective at extracting DL-specific input constraints. To fill this gap, we design and implement a new technique, DocTer, that analyzes API documentation to extract DL-specific input constraints for DL API functions. DocTer features a novel algorithm that automatically constructs rules to extract API parameter constraints from syntactic patterns, in the form of dependency parse trees, of API descriptions. These rules are then applied to a large volume of API documents in popular DL libraries to extract their input parameter constraints. To demonstrate the effectiveness of the extracted constraints, DocTer uses them to automatically generate valid and invalid inputs to test DL API functions. Our evaluation on three popular DL libraries (TensorFlow, PyTorch, and MXNet) shows that the precision of DocTer in extracting input constraints is 85.4%. DocTer detects 94 bugs in 174 API functions, including one previously unknown security vulnerability that is now documented in the CVE database, while a baseline technique without input constraints detects only 59 bugs. Most (63) of the 94 bugs were previously unknown, and 54 of them have been fixed or confirmed by developers after we reported them. In addition, DocTer detects 43 inconsistencies in the documentation, 39 of which have been fixed or confirmed.
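To make the idea of constraint-guided input generation concrete, the following minimal Python sketch shows how a constraint on one parameter could drive the generation of both valid and invalid inputs. It is only an illustration under stated assumptions: the `ParamConstraint` record, its fields (`dtype`, `ndim`, `low`, `high`), and the helper functions are hypothetical and are not DocTer's actual constraint representation or generator. Inputs are modeled as nested Python lists rather than real tensors.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamConstraint:
    """Hypothetical constraint record for one API parameter (an assumption)."""
    dtype: type   # int or float
    ndim: int     # required number of dimensions (0 means scalar)
    low: float    # inclusive lower bound on every element
    high: float   # inclusive upper bound on every element


def _scalar(dtype, rng, low, high):
    # Draw one element of the requested type from [low, high].
    if dtype is int:
        return rng.randint(int(low), int(high))
    return rng.uniform(low, high)


def _build(shape, make):
    # Build a rectangular nested list with the given shape.
    if not shape:
        return make()
    return [_build(shape[1:], make) for _ in range(shape[0])]


def _random_shape(ndim, rng, max_len=3):
    return [rng.randint(1, max_len) for _ in range(ndim)]


def gen_valid(c, rng):
    """Generate an input that satisfies every field of the constraint."""
    shape = _random_shape(c.ndim, rng)
    return _build(shape, lambda: _scalar(c.dtype, rng, c.low, c.high))


def gen_invalid(c, rng):
    """Generate an input that violates exactly one constraint."""
    if rng.random() < 0.5:
        # Violate the dimensionality constraint.
        shape = _random_shape(c.ndim + 1, rng)
        return _build(shape, lambda: _scalar(c.dtype, rng, c.low, c.high))
    # Violate the value-range constraint (all elements above `high`).
    shape = _random_shape(c.ndim, rng)
    return _build(shape, lambda: _scalar(c.dtype, rng, c.high + 1, c.high + 10))


def _depth(x):
    d = 0
    while isinstance(x, list):
        x, d = x[0], d + 1
    return d


def _flatten(x):
    if isinstance(x, list):
        for item in x:
            yield from _flatten(item)
    else:
        yield x


def satisfies(x, c):
    """Check an input against the constraint (used here as a test oracle)."""
    return _depth(x) == c.ndim and all(
        isinstance(v, c.dtype) and c.low <= v <= c.high for v in _flatten(x)
    )
```

In a testing loop, valid inputs would exercise the deeper logic of an API function, while invalid inputs would probe its input validation; each generated input would be passed to the function under test and the outcome (for example, a crash versus a clean error) recorded.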