Graph convolutional networks (GCNs) have been successfully applied to capture global non-consecutive and long-distance semantic information for text classification. However, while GCN-based methods have shown promising results in offline evaluations, they commonly follow a seen-token-seen-document paradigm, constructing a fixed document-token graph, and therefore cannot make inferences on new documents. This makes it challenging to deploy them in online systems that must infer over streaming text data. In this work, we present a continual GCN model (ContGCN) that generalizes inference from observed documents to unobserved documents. Concretely, we propose a new all-token-any-document paradigm that dynamically updates the document-token graph in every batch during both the training and testing phases of an online system. Moreover, we design an occurrence memory module and a self-supervised contrastive learning objective to update ContGCN in a label-free manner. A 3-month A/B test on Huawei's public opinion analysis system shows that ContGCN achieves an 8.86% performance gain over state-of-the-art methods. Offline experiments on five public datasets also show that ContGCN improves inference quality. The source code will be released at https://github.com/Jyonn/ContGCN.
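To make the all-token-any-document idea concrete, the sketch below shows one way a batch-level document-token graph could be assembled: the token node set is fixed (covering the whole vocabulary), while document nodes are attached dynamically for each incoming batch, with document-token edges weighted by TF-IDF. This is an illustrative assumption about the graph construction, not the paper's exact formulation (names such as `build_batch_graph` are hypothetical, and ContGCN's occurrence memory module is not modeled here).

```python
import math
from collections import Counter

def build_batch_graph(batch_docs, vocab):
    """Assemble a symmetric adjacency matrix for one batch.

    Node ordering: all vocabulary tokens first, then the batch's
    documents. The token block is fixed across batches; only the
    document nodes and their edges change per batch (a sketch of
    the all-token-any-document paradigm described in the abstract).

    batch_docs: list of tokenized documents (lists of strings)
    vocab:      dict mapping token -> fixed node index
    """
    V, D = len(vocab), len(batch_docs)
    n = V + D
    adj = [[0.0] * n for _ in range(n)]

    # Document frequency over the current batch, for the IDF term.
    df = Counter()
    for doc in batch_docs:
        for tok in set(doc):
            if tok in vocab:
                df[tok] += 1

    # Document-token edges weighted by smoothed TF-IDF.
    for d, doc in enumerate(batch_docs):
        tf = Counter(t for t in doc if t in vocab)
        for tok, cnt in tf.items():
            i = vocab[tok]   # fixed token node index
            j = V + d        # dynamic document node index
            w = (cnt / len(doc)) * math.log((1 + D) / (1 + df[tok]))
            adj[i][j] = adj[j][i] = w

    # Self-loops, as is standard before GCN normalization.
    for i in range(n):
        adj[i][i] = 1.0
    return adj
```

At inference time, an unseen document simply becomes a new document node wired into the fixed token block, so no retraining of the graph structure is required for new text.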