AI agents have rapidly gained popularity across research and industry as systems that extend large language models with additional capabilities to plan, use tools, remember, and act toward specific goals. Yet despite their promise, developers face persistent and often underexplored challenges when building, deploying, and maintaining these emerging systems. To identify these challenges, we study developer discussions on Stack Overflow, the world's largest developer-focused Q&A platform, with about 60 million questions and answers and 30 million users. We construct a taxonomy of developer challenges through tag expansion and filtering, apply LDA-MALLET for topic modeling, and manually validate and label the resulting themes. Our analysis reveals seven major areas of recurring issues encompassing 77 distinct technical challenges related to runtime integration, dependency management, orchestration complexity, and evaluation reliability. We further quantify topic popularity and difficulty to identify which issues are most common and which are hardest to resolve, map the tools and programming languages used in agent development, and track their evolution from 2021 to 2025 in relation to major AI model and framework releases. Finally, we discuss the implications of our results, offering concrete guidance for practitioners, researchers, and educators on agent reliability and developer support.