\textbf{Background:} Testing and validating the semantic correctness of patches produced by Automated Program Repair (APR) tools has received a lot of attention. Yet the eventual acceptance or rejection of suggested patches for real-world projects by human patch reviewers has received limited attention.\\ \textbf{Objective:} To address this issue, we plan to investigate whether (possibly incorrect) security patches suggested by APR tools are recognized by human reviewers. We also want to investigate whether knowing that a patch was produced by an allegedly specialized tool changes the decision of human reviewers.\\ \textbf{Method:} In the first phase, using a balanced design, we present human reviewers with a combination of patches generated by APR tools for different vulnerabilities and ask them to adopt or reject each patch. In the second phase, we tell participants that some of the proposed patches were generated by security-specialized tools (even if the tool was actually a `normal' APR tool) and measure whether the reviewers change their decision to adopt or reject a patch.\\ \textbf{Limitations:} The experiment will be conducted in an academic setting and, to maintain statistical power, will focus on a limited sample of popular APR tools and popular vulnerability types.