Reinforcement learning (RL) is a machine learning paradigm in which an autonomous agent learns to make an optimal sequence of decisions by interacting with its environment. The promise shown by RL-guided workflows in solving electronic design automation problems has encouraged hardware security researchers to apply autonomous RL agents to domain-specific problems. From a hardware security perspective, such agents are appealing because they can generate optimal actions in an unknown adversarial environment. Meanwhile, the continued globalization of the integrated circuit supply chain has pushed chip fabrication to offshore, untrusted entities, raising concerns about hardware security. Furthermore, the unknown adversarial environment and growing design complexity make it difficult for defenders to detect subtle malicious modifications made by attackers, known as hardware Trojans. In this brief, we outline the development of RL agents for detecting hardware Trojans, one of the most challenging hardware security problems. We also outline potential opportunities and list the challenges of applying RL to hardware security problems.