Assuring AI in unrestricted settings is a critical problem. Our framework addresses AI assurance challenges lying at the intersection of domain adaptation, fairness, and counterfactual analysis, operating by discovering and intervening on factors of variation in data (e.g., weather or illumination conditions) that significantly affect the robustness of AI models. Robustness is understood here as the insensitivity of model performance to variations in sensitive factors. Sensitive factors are traditionally set in a supervised setting, where the factors are known a priori (e.g., for fairness these could be factors such as sex or race). In contrast, our motivation is real-life scenarios where little, or nothing, is actually known a priori about the factors that cause models to fail. This leads us to consider several settings (unsupervised, domain generalization, semi-supervised) that correspond to different degrees of incomplete knowledge about those factors. Our two-step approach therefore works by (a) discovering, in an unsupervised fashion, the sensitive factors that cause AI systems to fail, and then (b) intervening on models to lessen those factors' influence. Our method considers three interventions: Augmentation, Coherence, and Adversarial Interventions (ACAI). We demonstrate the ability of interventions on discovered/source factors to generalize to target/real factors. We also demonstrate how adaptation to real factors of variation can be performed in the semi-supervised case, where some target factor labels are known, via automated intervention selection. Experiments show that our approach improves on baseline models with regard to achieving optimal utility vs. sensitivity/robustness tradeoffs.
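The two-step loop described above (discover a sensitive factor, then apply an intervention such as augmentation) can be sketched on a toy 1-D threshold classifier. This is a minimal illustration under assumptions of our own: the `SHIFT` offsets, the synthetic data generator, and the threshold model are all hypothetical, and discovery is proxied here by grouping held-out data on a known factor purely for evaluation, whereas the paper's discovery step is unsupervised.

```python
import random

random.seed(0)

# Hypothetical factor of variation: "dark" illumination shifts features.
SHIFT = {"bright": 0.0, "dark": 0.7}

def make_sample(factor):
    # Label-correlated feature plus a factor-dependent offset and small noise.
    label = random.randint(0, 1)
    feature = label + SHIFT[factor] + random.gauss(0.0, 0.05)
    return feature, factor, label

def accuracy(threshold, samples):
    return sum((f > threshold) == bool(y) for f, _, y in samples) / len(samples)

# Baseline model: threshold fit on "bright" data only.
train = [make_sample("bright") for _ in range(500)]
thresholds = [i / 100 for i in range(-50, 250)]
baseline = max(thresholds, key=lambda t: accuracy(t, train))

# Step (a): discovery -- group held-out data and flag the group with the
# largest performance drop as a sensitive factor.
test = [make_sample(random.choice(["bright", "dark"])) for _ in range(2000)]
groups = {}
for s in test:
    groups.setdefault(s[1], []).append(s)
acc = {g: accuracy(baseline, xs) for g, xs in groups.items()}
sensitive = min(acc, key=acc.get)

# Step (b): augmentation intervention -- add samples from the sensitive
# condition, re-fit, and measure the shrunken robustness gap.
augmented = train + [make_sample(sensitive) for _ in range(500)]
robust = max(thresholds, key=lambda t: accuracy(t, augmented))
acc_after = {g: accuracy(robust, xs) for g, xs in groups.items()}
gap_before = max(acc.values()) - min(acc.values())
gap_after = max(acc_after.values()) - min(acc_after.values())
```

On this toy data the baseline threshold misclassifies most "dark" class-0 samples, so discovery flags "dark" as sensitive; after augmentation the re-fit threshold closes most of the accuracy gap between the two conditions, mirroring the utility-vs-robustness tradeoff the experiments measure.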