APIs often transmit far more data to client applications than those applications need and, in the case of web applications, often do so over public channels. This issue, termed Excessive Data Exposure (EDE), was ranked the third most significant API vulnerability by OWASP in 2019. However, there are few automated tools, in either research or industry, that effectively find and remediate such issues. This is unsurprising because the problem lacks an explicit test oracle: the vulnerability does not manifest through explicit abnormal behaviours (e.g., program crashes or memory access violations). In this work, we develop a metamorphic relation to tackle that challenge and build the first fuzzing tool, which we call EDEFuzz, to systematically detect EDEs. EDEFuzz can significantly reduce the false negatives that occur with manual inspection and ad-hoc text-matching techniques, currently the most widely used approaches. We tested EDEFuzz against the sixty-nine applicable targets from the Alexa Top-200 and found 33,365 potential leaks, illustrating our tool's broad applicability and scalability. In a more tightly controlled experiment on eight popular websites in Australia, EDEFuzz achieved a high true positive rate of 98.65% with minimal configuration, illustrating our tool's accuracy and efficiency.
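To illustrate how a metamorphic relation can stand in for a missing test oracle, here is a minimal sketch. The abstract does not define the relation, so everything below is an assumption rather than the tool's actual design: the relation assumed here is "if deleting a field from an API response leaves the rendered page unchanged, that field is a candidate excessive-data leak". The names `leaf_paths`, `delete_path`, `find_unused_fields`, and `render` are hypothetical, and `render` is a toy stand-in for a real browser rendering step.

```python
import copy
from typing import Any, Callable, Iterator, List, Tuple

Path = Tuple[Any, ...]  # sequence of dict keys / list indices


def leaf_paths(data: Any, prefix: Path = ()) -> Iterator[Path]:
    """Yield the path to every leaf value in a JSON-like structure."""
    if isinstance(data, dict):
        for key, value in data.items():
            yield from leaf_paths(value, prefix + (key,))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            yield from leaf_paths(value, prefix + (index,))
    else:
        yield prefix


def delete_path(data: Any, path: Path) -> Any:
    """Return a deep copy of `data` with the leaf at `path` removed."""
    mutated = copy.deepcopy(data)
    parent = mutated
    for step in path[:-1]:
        parent = parent[step]
    del parent[path[-1]]
    return mutated


def find_unused_fields(response: Any, render: Callable[[Any], str]) -> List[Path]:
    """Report fields whose removal does not change the rendered output.

    Assumed metamorphic relation: render(response) == render(mutated)
    suggests the deleted field is not needed by the client.
    """
    baseline = render(response)
    return [
        path
        for path in leaf_paths(response)
        if path and render(delete_path(response, path)) == baseline
    ]


# Toy stand-in for a page renderer: it only ever displays the user's name.
def render(resp: dict) -> str:
    user = resp.get("user", {})
    return f"<h1>{user.get('name', '')}</h1>"


# Hypothetical API response used only for this illustration.
response = {
    "user": {"name": "Ada", "email": "ada@example.com", "password_hash": "abc123"}
}

print(find_unused_fields(response, render))
# prints [('user', 'email'), ('user', 'password_hash')]
```

In this toy run, removing `name` changes the rendered output, so it is treated as needed, while `email` and `password_hash` are flagged because the page looks the same without them. No crash or explicit failure is required to raise a flag; the comparison between the two renderings plays the role of the oracle.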