From Accuracy to Impact: The Impact-Driven AI Framework (IDAIF) for Aligning Engineering Architecture with Theory of Change

This paper introduces the Impact-Driven AI Framework (IDAIF), an architectural methodology that integrates Theory of Change (ToC) principles with modern artificial intelligence system design. As AI systems increasingly influence high-stakes domains such as healthcare, finance, and public policy, the alignment problem (ensuring that AI behavior corresponds with human values and intentions) has become critical. Current approaches predominantly optimize technical performance metrics while neglecting the sociotechnical dimensions of AI deployment. IDAIF addresses this gap by establishing a systematic mapping between ToC's five-stage model (Inputs, Activities, Outputs, Outcomes, Impact) and five corresponding AI architectural layers (Data Layer, Pipeline Layer, Inference Layer, Agentic Layer, Normative Layer). Each layer rests on a stated theoretical foundation: multi-objective Pareto optimization for value alignment, hierarchical multi-agent orchestration for outcome achievement, causal directed acyclic graphs (DAGs) for hallucination mitigation, and adversarial debiasing with Reinforcement Learning from Human Feedback (RLHF) for fairness assurance. We provide formal mathematical formulations for each component and introduce an Assurance Layer that manages assumption failures through guardian architectures. Three case studies demonstrate the application of IDAIF in the healthcare, cybersecurity, and software engineering domains. The framework represents a shift from model-centric to impact-centric AI development and gives engineers concrete architectural patterns for building ethical, trustworthy, and socially beneficial AI systems.
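The stage-to-layer mapping described in the abstract can be written down as a small lookup table. The following is only a minimal sketch: the pairing follows the order in which the abstract lists the stages and layers, while the names `ToCStage`, `Layer`, `TOC_TO_LAYER`, `layer_for`, and `trace` are hypothetical and do not come from the paper. The Assurance Layer is left out because the abstract does not say which stage, if any, it corresponds to.

```python
from enum import Enum


class ToCStage(Enum):
    """The five Theory of Change stages named in the abstract."""
    INPUTS = "Inputs"
    ACTIVITIES = "Activities"
    OUTPUTS = "Outputs"
    OUTCOMES = "Outcomes"
    IMPACT = "Impact"


class Layer(Enum):
    """The five architectural layers named in the abstract."""
    DATA = "Data Layer"
    PIPELINE = "Pipeline Layer"
    INFERENCE = "Inference Layer"
    AGENTIC = "Agentic Layer"
    NORMATIVE = "Normative Layer"


# Assumption: stages and layers pair up positionally, in the order listed.
TOC_TO_LAYER = {
    ToCStage.INPUTS: Layer.DATA,
    ToCStage.ACTIVITIES: Layer.PIPELINE,
    ToCStage.OUTPUTS: Layer.INFERENCE,
    ToCStage.OUTCOMES: Layer.AGENTIC,
    ToCStage.IMPACT: Layer.NORMATIVE,
}


def layer_for(stage: ToCStage) -> Layer:
    """Return the architectural layer paired with a ToC stage."""
    return TOC_TO_LAYER[stage]


def trace() -> list[tuple[str, str]]:
    """List (stage, layer) name pairs in ToC order, from Inputs to Impact."""
    return [(stage.value, TOC_TO_LAYER[stage].value) for stage in ToCStage]


if __name__ == "__main__":
    for stage_name, layer_name in trace():
        print(f"{stage_name} -> {layer_name}")
```

Using enums rather than bare strings keeps the two vocabularies separate, so a design review can ask, for any ToC stage, which layer is responsible for it.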

