We present a new type of attack in which source code is maliciously encoded so that it appears different to a compiler and to the human eye. This attack exploits subtleties in text-encoding standards such as Unicode to produce source code whose tokens are logically encoded in a different order from the one in which they are displayed, leading to vulnerabilities that cannot be perceived directly by human code reviewers. 'Trojan Source' attacks, as we call them, pose an immediate threat both to first-party software and, through supply-chain compromise, to the industry as a whole. We present working examples of Trojan Source attacks in C, C++, C#, JavaScript, Java, Rust, Go, and Python. We propose definitive compiler-level defenses, and describe other mitigating controls that can be deployed in editors, repositories, and build pipelines while compilers are upgraded to block this attack.
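To make the repository and build-pipeline controls concrete, the following is a minimal sketch of a scanner that flags Unicode bidirectional control characters in source text. It is an illustration under stated assumptions, not the defense proposed in this work: the function name `find_bidi_controls` and the sample string are hypothetical, and the character set is limited to the explicit embedding, override, and isolate controls of the Unicode Bidirectional Algorithm.

```python
import unicodedata

# Explicit directional formatting characters from the Unicode
# Bidirectional Algorithm. Assumption: flagging every occurrence is
# acceptable, even though some legitimate text uses these characters.
BIDI_CONTROLS = {
    "\u202A",  # LEFT-TO-RIGHT EMBEDDING
    "\u202B",  # RIGHT-TO-LEFT EMBEDDING
    "\u202C",  # POP DIRECTIONAL FORMATTING
    "\u202D",  # LEFT-TO-RIGHT OVERRIDE
    "\u202E",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def find_bidi_controls(source):
    """Return (line, column, character name) for each bidi control found.

    Lines and columns are 1-based; columns count code points, not bytes.
    """
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                findings.append((line_no, col_no, unicodedata.name(ch)))
    return findings


# Hypothetical sample: escape sequences stand in for the invisible
# characters an attacker would embed directly in a string or comment.
sample = "access = 'user'\nif access != 'none\u202e\u2066':  # check\n    pass\n"

for line_no, col_no, name in find_bidi_controls(sample):
    print(f"line {line_no}, column {col_no}: {name}")
```

A check of this kind could run as a pre-commit hook or a CI step that fails when the list of findings is non-empty. Because it rejects all such characters, it is deliberately coarser than a compiler that can tell whether a directional control is left unterminated inside a comment or string literal.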