Large language models (LLMs) are increasingly deployed locally for privacy and accessibility, yet users lack tools to measure the resource usage, environmental impact, and efficiency of these deployments. This paper presents EnviroLLM, an open-source toolkit for tracking, benchmarking, and optimizing performance and energy consumption when running LLMs on personal devices. The system provides real-time process monitoring, benchmarking across multiple platforms (Ollama, LM Studio, vLLM, and OpenAI-compatible APIs), persistent storage with visualizations for longitudinal analysis, and personalized model and optimization recommendations. It also pairs LLM-as-judge quality evaluations with energy and speed metrics, enabling users to assess quality-efficiency tradeoffs when testing models with custom prompts.
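To make the benchmarking pathway concrete, the sketch below shows how a single prompt could be timed against a local OpenAI-compatible endpoint to derive a tokens-per-second figure of the kind EnviroLLM reports. This is a minimal illustration, not EnviroLLM's actual interface; the ENDPOINT and MODEL values are placeholder assumptions, and a full tool would add energy measurement (e.g., via RAPL or GPU telemetry) alongside the timing.

```python
import time
import requests

# Placeholder values: any OpenAI-compatible server (e.g., one started by
# vLLM or LM Studio) exposing /v1/chat/completions would work here.
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "llama-3-8b-instruct"  # hypothetical model name

def benchmark_prompt(prompt: str) -> dict:
    """Time one completion and derive tokens/sec from the API's usage field."""
    start = time.perf_counter()
    resp = requests.post(
        ENDPOINT,
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=300,
    )
    elapsed = time.perf_counter() - start
    # OpenAI-compatible responses report token counts under "usage".
    completion_tokens = resp.json()["usage"]["completion_tokens"]
    return {
        "seconds": elapsed,
        "completion_tokens": completion_tokens,
        "tokens_per_second": completion_tokens / elapsed,
    }

if __name__ == "__main__":
    print(benchmark_prompt("Summarize the benefits of local LLM inference."))
```

Targeting the OpenAI-compatible wire format is what lets a single benchmarking loop like this cover Ollama, LM Studio, and vLLM backends without per-platform code.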