Machine learning models are exposed to adversarial attacks when deployed in production. Quantifying the contributing factors and uncertainties with empirical measures could help the industry assess the risk of downloading and deploying common model types. This work proposes adapting the formalism of the traditional Drake Equation to estimate the number of potentially successful adversarial attacks on a deployed model. The Drake Equation, originally formulated to estimate the number of radio-capable extraterrestrial civilizations, is famous for parameterizing uncertainties and has since been applied in many research fields beyond that original purpose. While previous work has outlined methods for discovering vulnerabilities in public model architectures, the proposed equation seeks to provide a semi-quantitative benchmark for evaluating and estimating the potential risk factors for adversarial attacks.
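For reference, the classic Drake Equation that this formalism builds on is a simple product of factors, each carrying its own uncertainty. Only the original astronomical form is shown here; the abstract does not list the factors of the adapted equation, so none are assumed.

```latex
% Classic Drake Equation (original astronomical form, shown for context only)
N = R_{*} \cdot f_{p} \cdot n_{e} \cdot f_{l} \cdot f_{i} \cdot f_{c} \cdot L
% N     : number of civilizations in the galaxy with detectable radio emissions
% R_*   : average rate of star formation in the galaxy
% f_p   : fraction of those stars that have planets
% n_e   : average number of potentially life-supporting planets per star with planets
% f_l   : fraction of those planets on which life develops
% f_i   : fraction of life-bearing planets on which intelligent life develops
% f_c   : fraction of those civilizations that develop detectable technology
% L     : length of time such civilizations release detectable signals
```

Because the estimate is a product, uncertainty in any single factor propagates directly to the final count, which is why the form suits semi-quantitative benchmarks rather than precise predictions.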