As users increasingly turn to large language model (LLM) based web agents to automate online tasks, these agents may encounter dark patterns: deceptive user interface designs that manipulate users into making unintended decisions. Although dark patterns primarily target human users, their potentially harmful impact on LLM-based generalist web agents remains unexplored. In this paper, we present the first study of how dark patterns affect the decision-making process of LLM-based generalist web agents. To this end, we introduce LiteAgent, a lightweight framework that automatically prompts agents to execute tasks while capturing comprehensive logs and screen recordings of their interactions. We also present TrickyArena, a controlled environment comprising web applications from domains such as e-commerce, streaming services, and news platforms, each containing diverse and realistic dark patterns that can be selectively enabled or disabled. Using LiteAgent and TrickyArena, we conduct multiple experiments to assess the impact of both individual and combined dark patterns on web agent behavior. We evaluate six popular LLM-based generalist web agents across three LLMs and find that when a single dark pattern is present, agents are susceptible to it 41% of the time on average. We also find that agent susceptibility can be influenced both by modifying a dark pattern's UI attributes, through visual design changes or HTML code adjustments, and by introducing multiple dark patterns simultaneously. This study emphasizes the need for holistic defense mechanisms in web agents, encompassing both agent-specific protections and broader web safety measures.