Securing blockchain-enabled IoT networks against sophisticated adversarial attacks remains a critical challenge. This paper presents a trust-based delegated consensus framework integrating Fully Homomorphic Encryption (FHE) with Attribute-Based Access Control (ABAC) for privacy-preserving policy evaluation, combined with learning-based defense mechanisms. We systematically compare three reinforcement learning approaches -- tabular Q-learning (RL), Deep RL with Dueling Double DQN (DRL), and Multi-Agent RL (MARL) -- against five distinct attack families: Naive Malicious Attack (NMA), Collusive Rumor Attack (CRA), Adaptive Adversarial Attack (AAA), Byzantine Fault Injection (BFI), and Time-Delayed Poisoning (TDP). Experimental results on a 16-node simulated IoT network reveal significant performance variations: MARL achieves superior detection under collusive attacks (F1=0.85 vs. DRL's 0.68 and RL's 0.50), while DRL and MARL both attain perfect detection (F1=1.00) against adaptive attacks where RL fails (F1=0.50). All agents successfully defend against Byzantine attacks (F1=1.00). Most critically, the Time-Delayed Poisoning attack proves catastrophic for all agents, with F1 scores dropping to 0.11-0.16 after sleeper activation, demonstrating the severe threat posed by trust-building adversaries. Our findings indicate that coordinated multi-agent learning provides measurable advantages for defending against sophisticated trust manipulation attacks in blockchain IoT environments.
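For readers less familiar with the detection metric reported above, the following is a minimal sketch of how an F1 score can be computed for malicious-node detection. It assumes detection is scored per node as set membership (flagged versus truly malicious). The function name `f1_score`, the node IDs, and the example sets are hypothetical illustrations, not the paper's evaluation code or data.

```python
def f1_score(flagged: set[int], malicious: set[int]) -> float:
    """F1 for node-level detection: harmonic mean of precision and recall.

    flagged   -- node IDs the defender marked as malicious (hypothetical input)
    malicious -- node IDs that are actually malicious (hypothetical ground truth)
    """
    tp = len(flagged & malicious)   # malicious nodes correctly flagged
    fp = len(flagged - malicious)   # honest nodes wrongly flagged
    fn = len(malicious - flagged)   # malicious nodes missed
    if tp == 0:
        return 0.0                  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 16-node network (IDs 0..15) with four malicious nodes.
malicious = {3, 7, 11, 14}
flagged = {3, 7, 9}                 # two hits, one false alarm, two misses
score = f1_score(flagged, malicious)
```

Under this definition, a sleeper node that behaves honestly until activation counts as a miss for as long as it remains unflagged, which lowers recall and therefore F1.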