We introduce an Analyze-Revise-Finetune (ARF) pipeline that enables smaller open-source large language models (LLMs) to surpass substantially larger proprietary models on customer service summarization tasks. The pipeline first analyzes and categorizes common errors in summaries produced by a teacher model (GPT-3.5), then performs targeted revision with an editor model (Llama 3.1 70B) to generate high-quality, refined training data. Fine-tuning a smaller student model (Llama 3.1 8B) on this refined data yields summarization performance superior to that of GPT-3.5. The ARF pipeline improves cost efficiency and data privacy while maintaining competitive accuracy, illustrating a generalizable framework for enhancing open-source LLMs across diverse downstream applications.
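The data-refinement stage of the pipeline can be sketched as follows. This is a minimal, hypothetical illustration: the `analyze` and `revise` stubs stand in for what, in the actual pipeline, would be prompts to the teacher and editor models (GPT-3.5 and Llama 3.1 70B); the error categories and truncation rule here are invented placeholders, not the paper's taxonomy.

```python
# Hypothetical sketch of the Analyze-Revise data flow that produces the
# refined fine-tuning set for the student model (Llama 3.1 8B in the paper).
# Real model calls are replaced by toy rule-based stubs.

def analyze(summary: str, transcript: str) -> list[str]:
    """Stub error analysis: flag placeholder error categories in a draft summary."""
    errors = []
    if "customer" not in summary:
        errors.append("missing_participant")  # summary omits a key party
    if len(summary.split()) > 50:
        errors.append("too_verbose")          # summary exceeds a length budget
    return errors

def revise(summary: str, errors: list[str]) -> str:
    """Stub editor pass: in ARF this would be a targeted revision prompt."""
    revised = summary
    if "too_verbose" in errors:
        revised = " ".join(revised.split()[:50])  # crude truncation as a stand-in
    return revised

def build_finetune_set(pairs: list[tuple[str, str]]) -> list[dict]:
    """Turn (transcript, teacher draft) pairs into refined training examples."""
    dataset = []
    for transcript, draft in pairs:
        errors = analyze(draft, transcript)
        refined = revise(draft, errors) if errors else draft
        dataset.append({"input": transcript, "target": refined})
    return dataset
```

The key design point the sketch reflects is that refinement happens once, offline, on the training data: the student model never needs the large editor model at inference time, which is where the cost and privacy gains come from.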