Auditing trained deep learning (DL) models prior to deployment is vital for preventing unintended consequences. One of the biggest challenges in auditing is understanding how to obtain human-interpretable specifications that are directly useful to the end-user. We address this challenge through a sequence of semantically aligned unit tests, where each unit test verifies whether a predefined specification (e.g., accuracy over 95%) is satisfied with respect to controlled and semantically aligned variations in the input space (e.g., in face recognition, the angle relative to the camera). We perform these unit tests by directly verifying the semantically aligned variations in an interpretable latent space of a generative model. Our framework, AuditAI, bridges the gap between interpretable formal verification and scalability. With evaluations on four different datasets, covering images of towers, chest X-rays, human faces, and ImageNet classes, we show how AuditAI allows us to obtain controlled variations for verification and certified training while addressing the limitations of verifying using only pixel-space perturbations. A blog post accompanying the paper is available at https://developer.nvidia.com/blog/nvidia-research-auditing-ai-models-for-verified-deployment-under-semantic-specifications
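To make the unit-test idea concrete, the sketch below shows one possible form such a test could take; it is a minimal illustration, not the paper's actual API. The callables `encode`, `decode`, `perturb`, and `classify`, and the parameter names, are hypothetical stand-ins for a generative model's encoder/decoder, a controlled shift along one semantic latent direction (e.g., head angle), and the model under audit.

```python
def semantic_unit_test(images, labels, encode, decode, perturb, classify,
                       direction, amounts, threshold=0.95):
    """Hypothetical sketch of a semantic unit test.

    Checks whether accuracy stays above `threshold` when each input is
    varied in a controlled way along one semantic latent direction
    (e.g., the angle relative to the camera in face recognition).
    """
    correct, total = 0, 0
    for x, y in zip(images, labels):
        z = encode(x)                          # map image to latent code
        for a in amounts:                      # controlled variation range
            x_var = decode(perturb(z, direction, a))
            correct += int(classify(x_var) == y)
            total += 1
    accuracy = correct / total
    return accuracy >= threshold               # specification satisfied?
```

In this sketch the specification "accuracy over 95%" becomes the `threshold` argument, and the verified region is the set of latent codes reachable via `perturb` over the chosen `amounts`; AuditAI's actual verification and certified training operate on such latent-space variations rather than pixel-space perturbations.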