Large Language Models (LLMs) have become effective zero-shot classifiers, but their high computational requirements and environmental costs limit their practicality for large-scale annotation in high-performance computing (HPC) environments. To support more sustainable workflows, we present Text2Graph, an open-source Python package that provides a modular implementation of existing text-to-graph classification approaches. The framework lets users flexibly combine LLM-based partial annotation with Graph Neural Network (GNN) label propagation, making it straightforward to swap components such as feature extractors, edge construction methods, and sampling strategies. We benchmark Text2Graph in a zero-shot setting on five datasets spanning topic classification and sentiment analysis, comparing multiple variants against other zero-shot approaches for text classification. In addition to reporting performance, we provide detailed estimates of energy consumption and carbon emissions, showing that graph-based propagation achieves competitive results at a fraction of the energy and environmental cost.
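The annotate-then-propagate idea can be illustrated with a minimal sketch. Note that this is not the Text2Graph API: it uses TF-IDF features as a stand-in for the swappable feature extractor, and scikit-learn's `LabelSpreading` (a kNN graph with label spreading) as a stand-in for GNN-based propagation; the texts and labels are invented for illustration.

```python
# Simplified sketch of partial annotation + graph-based label propagation.
# Stand-ins (NOT the Text2Graph API): TF-IDF instead of LLM/encoder features,
# scikit-learn LabelSpreading instead of a GNN.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelSpreading

texts = [
    "the match ended in a late goal",       # sports
    "the striker scored twice this season", # sports
    "shares fell after the earnings call",  # finance
    "the central bank raised rates again",  # finance
]
# Partial annotation: pretend an LLM labeled only the first and third texts;
# -1 marks unlabeled nodes whose labels will be inferred over the graph.
labels = np.array([0, -1, 1, -1])

# Feature extraction (a swappable component in the framework).
X = TfidfVectorizer().fit_transform(texts).toarray()

# Edge construction + propagation: kNN graph with label spreading.
model = LabelSpreading(kernel="knn", n_neighbors=2)
model.fit(X, labels)
print(model.transduction_)  # predicted labels for all four texts
```

Because only a small subset of documents needs an LLM call, the bulk of the corpus is labeled by the far cheaper propagation step, which is the source of the energy savings the abstract reports.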