The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist focused on the principles of good analysis adopted by the intelligence community (IC), to help promote the development of more understandable systems and to engender trust in AI outputs. Such a scorecard enables a transparent, consistent, and meaningful understanding of AI tools applied for commercial and government use. A standard is built on compliance and agreement through policy, which requires buy-in from stakeholders. While consistency for testing might exist only across a standard data set, the community requires discussion of verification and validation approaches that can lead to interpretability, explainability, and proper use. The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system supporting various operational needs. These standards include sourcing, uncertainty, consistency, accuracy, and visualization. Three use cases are presented as notional examples that support security applications for comparative analysis.
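As an illustration only (not an artifact defined by the paper), the sketch below shows one hypothetical way a MAST-style checklist could be represented in code: each criterion named in the abstract as drawn from the ICD 203 tradecraft standards (sourcing, uncertainty, consistency, accuracy, visualization) receives a rating for a given AI/ML system, and a simple aggregate is computed. The class names, rating scale, and aggregation rule are assumptions made for illustration, not the scorecard's actual design.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical MAST-style criteria, taken from the examples the abstract lists;
# the full scorecard derived from ICD 203 may define more or different criteria.
CRITERIA = ("sourcing", "uncertainty", "consistency", "accuracy", "visualization")


@dataclass
class ScorecardEntry:
    """One assessor's ratings of an AI/ML system against MAST-style criteria."""
    system_name: str
    # Assumed 0-3 rating scale (0 = not addressed, 3 = fully addressed); illustrative only.
    ratings: Dict[str, int] = field(default_factory=dict)

    def rate(self, criterion: str, score: int) -> None:
        """Record a rating for one criterion, validating the criterion and scale."""
        if criterion not in CRITERIA:
            raise ValueError(f"unknown criterion: {criterion}")
        if not 0 <= score <= 3:
            raise ValueError("score must be between 0 and 3")
        self.ratings[criterion] = score

    def aggregate(self) -> float:
        """Mean rating across the criteria rated so far (placeholder aggregate)."""
        if not self.ratings:
            return 0.0
        return sum(self.ratings.values()) / len(self.ratings)


if __name__ == "__main__":
    entry = ScorecardEntry(system_name="example-ml-system")
    entry.rate("sourcing", 2)
    entry.rate("uncertainty", 1)
    entry.rate("visualization", 3)
    print(f"{entry.system_name}: mean rating = {entry.aggregate():.2f}")
```

In practice, a scorecard like this could be filled out independently by developers and users and compared across the three notional use cases mentioned above; the simple mean here stands in for whatever comparison or reporting scheme an organization adopts.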