Are Capture-the-Flag (CTF) competitions obsolete? In 2025, Cybersecurity AI (CAI) systematically conquered some of the world's most prestigious hacking competitions, achieving Rank #1 at multiple events and consistently outperforming thousands of human teams. Across five major circuits (HTB's AI vs Humans, Cyber Apocalypse with 8,129 teams, Dragos OT CTF, UWSP Pointer Overflow, and the Neurogrid CTF showdown), CAI demonstrated that Jeopardy-style CTFs have become a solved game for well-engineered AI agents. At Neurogrid, CAI captured 41 of 45 flags to claim the $50,000 top prize; at Dragos OT, it reached 10K points 37% faster than elite human teams; and even when deliberately paused mid-competition, it maintained top-tier rankings. Critically, CAI achieved this dominance through our specialized alias1 model architecture, which delivers enterprise-scale AI security operations with augmented autonomy and at unprecedented cost efficiency: it reduces the inference cost of 1B tokens from $5,940 to just $119, making continuous security agent operation financially viable for the first time. These results force an uncomfortable reckoning: if autonomous agents now dominate, at negligible cost, the competitions designed to identify top security talent, what are CTFs actually measuring? This paper presents comprehensive evidence of AI capability across the 2025 CTF circuit and argues that the security community must urgently transition from Jeopardy-style contests to Attack & Defense formats that genuinely test adaptive reasoning and resilience, capabilities that remain uniquely human, for now.
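To put the headline figures in proportion, the arithmetic can be sketched as follows. This is only an illustrative calculation from the numbers quoted above; the function name `cost_reduction` is hypothetical, and the per-million-token figures assume a flat per-token price, which the abstract does not state.

```python
def cost_reduction(baseline_usd: float, new_usd: float) -> tuple[float, float]:
    """Return (fold reduction, percent saved) between two costs."""
    fold = baseline_usd / new_usd
    percent_saved = (1 - new_usd / baseline_usd) * 100
    return fold, percent_saved


# Figures quoted in the abstract: cost of 1B tokens of inference.
BASELINE_USD_PER_1B = 5940.0
ALIAS1_USD_PER_1B = 119.0

fold, percent_saved = cost_reduction(BASELINE_USD_PER_1B, ALIAS1_USD_PER_1B)

# Assumption: flat pricing, so 1B tokens = 1,000 x 1M tokens.
baseline_per_1m = BASELINE_USD_PER_1B / 1000
alias1_per_1m = ALIAS1_USD_PER_1B / 1000

# Neurogrid result quoted in the abstract: 41 of 45 flags.
solve_rate = 41 / 45

print(f"Cost reduction: {fold:.1f}x ({percent_saved:.1f}% saved)")
print(f"Per 1M tokens: ${baseline_per_1m:.2f} -> ${alias1_per_1m:.3f}")
print(f"Neurogrid solve rate: {solve_rate:.1%}")
```

Under these assumptions the quoted prices correspond to roughly a 50-fold reduction, and the Neurogrid result to a solve rate of about 91%.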