We present KnowThyself, an agentic assistant for large language model (LLM) interpretability. Existing interpretability tools provide useful insights but remain fragmented and code-intensive. KnowThyself consolidates these capabilities into a chat-based interface, where users can upload models, pose natural language questions, and obtain interactive visualizations with guided explanations. At its core, an orchestrator LLM first reformulates the user query, an agent router then directs it to a specialized module, and the module's output is finally contextualized into a coherent explanation. This design lowers technical barriers and provides an extensible platform for LLM inspection. By embedding the whole process into a conversational workflow, KnowThyself offers a robust foundation for accessible LLM interpretability.
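The three-stage flow described above (reformulate, route, contextualize) can be illustrated with a minimal Python sketch. It is not the KnowThyself implementation: every name here (`reformulate`, `route`, `contextualize`, and the `attention` and `attribution` modules) is hypothetical, the orchestrator LLM is replaced by simple string normalization, and the agent router is replaced by keyword matching, purely to show how the stages compose.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ModuleResult:
    module: str   # name of the module that produced the result
    payload: str  # placeholder for a visualization or analysis output


def reformulate(query: str) -> str:
    """Stand-in for the orchestrator LLM: lowercases and collapses whitespace."""
    return " ".join(query.lower().split())


# Hypothetical specialized modules; real ones would run interpretability tools.
def attention_module(query: str) -> ModuleResult:
    return ModuleResult("attention", "attention heatmap placeholder")


def attribution_module(query: str) -> ModuleResult:
    return ModuleResult("attribution", "token attribution placeholder")


MODULES: Dict[str, Callable[[str], ModuleResult]] = {
    "attention": attention_module,
    "attribution": attribution_module,
}

# Assumed keyword table; a stand-in for whatever routing the agent router uses.
KEYWORDS: Dict[str, List[str]] = {
    "attention": ["attention", "head"],
    "attribution": ["attribution", "important", "saliency"],
}


def route(query: str) -> str:
    """Return the first module whose keywords appear in the query."""
    for name, words in KEYWORDS.items():
        if any(word in query for word in words):
            return name
    return "attention"  # arbitrary default for this sketch


def contextualize(query: str, result: ModuleResult) -> str:
    """Wrap the raw module output in a short explanation."""
    return f"For '{query}', the {result.module} module returned: {result.payload}."


def answer(user_query: str) -> str:
    """Run the full pipeline: reformulate, route, execute, contextualize."""
    query = reformulate(user_query)
    result = MODULES[route(query)](query)
    return contextualize(query, result)
```

Calling `answer("Which tokens are most IMPORTANT?")` sends the normalized query to the hypothetical attribution module. Keeping the modules in a plain dictionary mirrors the extensibility claim: in this sketch, adding a capability means registering one more callable and its keywords.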