Industry practitioners routinely face the problem of choosing an appropriate model for deployment under competing considerations, such as maximizing a metric that is crucial for production or reducing total cost under financial constraints. In this work, we focus on the text classification task and present a quantitative analysis of this challenge. Using classification accuracy as the main metric, we evaluate a variety of classifiers, including large language models, along with their associated costs: annotation cost, training (fine-tuning) cost, and inference cost. We then discuss model choices for specific situations, such as when a large number of samples must be processed at inference time. We hope our work will help practitioners better understand the cost/quality trade-offs in text classification.
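To make the cost comparison concrete, the following is a minimal sketch of how the three cost components named above could be combined and compared across two candidate models. It assumes a simple additive cost model (one-time annotation and training costs plus a constant per-sample inference cost), which the abstract does not itself state. The names `CostProfile`, `total_cost`, and `break_even_samples`, and all numbers in the example, are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CostProfile:
    """Hypothetical costs for one model, all in the same currency unit."""
    annotation: float     # one-time cost of labeling training data
    training: float       # one-time training (fine-tuning) cost
    per_inference: float  # cost of classifying a single sample


def total_cost(profile: CostProfile, n_samples: int) -> float:
    """Total cost under the assumed additive model: fixed costs + n * per-sample cost."""
    return profile.annotation + profile.training + profile.per_inference * n_samples


def break_even_samples(a: CostProfile, b: CostProfile) -> Optional[int]:
    """Smallest number of inference samples at which model `a` is no more
    expensive than model `b`, or None if that never happens."""
    fixed_gap = (a.annotation + a.training) - (b.annotation + b.training)
    if fixed_gap <= 0:
        # `a` is already no more expensive before any inference is run.
        return 0
    per_sample_saving = b.per_inference - a.per_inference
    if per_sample_saving <= 0:
        # `a` costs more up front and saves nothing per sample.
        return None
    return math.ceil(fixed_gap / per_sample_saving)


if __name__ == "__main__":
    # Illustrative numbers only: a fine-tuned small model with up-front costs
    # versus a zero-shot large model with a higher per-sample cost.
    fine_tuned = CostProfile(annotation=500.0, training=100.0, per_inference=0.5)
    zero_shot = CostProfile(annotation=0.0, training=0.0, per_inference=2.5)
    n = break_even_samples(fine_tuned, zero_shot)
    print("break-even sample count:", n)
    print("fine-tuned cost at break-even:", total_cost(fine_tuned, n))
    print("zero-shot cost at break-even:", total_cost(zero_shot, n))
```

Under these assumptions, a model with high fixed costs but cheap inference becomes the cheaper option once the inference volume passes the break-even point. This is the kind of large-volume situation the abstract refers to. Accuracy is not modeled here and would have to be weighed separately.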