News text classification is a crucial task in natural language processing, essential for organizing and filtering the massive volume of digital content. Traditional methods typically rely on statistical features like term frequencies or TF-IDF values, which are effective at capturing word-level importance but often fail to reflect contextual meaning. In contrast, modern deep learning approaches utilize semantic features to understand word usage within context, yet they may overlook simple, high-impact statistical indicators. This paper introduces an Attention-Guided Feature Fusion (AGFF) model that combines statistical and semantic features in a unified framework. The model applies an attention-based mechanism to dynamically determine the relative importance of each feature type, enabling more informed classification decisions. Through evaluation on benchmark news datasets, the AGFF model demonstrates superior performance compared to both traditional statistical models and purely semantic deep learning models. The results confirm that strategic integration of diverse feature types can significantly enhance classification accuracy. Additionally, ablation studies validate the contribution of each component in the fusion process. The findings highlight the model's ability to balance and exploit the complementary strengths of statistical and semantic representations, making it a practical and effective solution for real-world news classification tasks.
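The abstract does not specify the AGFF architecture, so the following is only a minimal NumPy sketch of one common way an attention mechanism can weight two feature views. It assumes each view (statistical and semantic) is projected into a shared space, scored against a context vector, and combined by softmax weights. All names here (`attention_fuse`, `W_stat`, `W_sem`, `v`, `W_out`), the dimensions, and the random inputs are hypothetical placeholders, not the paper's actual parameters or data.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(stat_feats, sem_feats, W_stat, W_sem, v):
    """Fuse a statistical and a semantic feature view per example.

    stat_feats: (batch, d_stat), e.g. TF-IDF-style vectors (assumed)
    sem_feats:  (batch, d_sem),  e.g. encoder embeddings (assumed)
    W_stat: (d_stat, d), W_sem: (d_sem, d), v: (d,)
    Returns the fused representation (batch, d) and the
    attention weights over the two views (batch, 2).
    """
    h_stat = np.tanh(stat_feats @ W_stat)        # (batch, d)
    h_sem = np.tanh(sem_feats @ W_sem)           # (batch, d)
    views = np.stack([h_stat, h_sem], axis=1)    # (batch, 2, d)
    scores = views @ v                           # (batch, 2)
    alpha = softmax(scores, axis=1)              # weights sum to 1 per example
    fused = (alpha[:, :, None] * views).sum(axis=1)  # (batch, d)
    return fused, alpha

# Hypothetical sizes and random stand-in inputs, for illustration only.
rng = np.random.default_rng(0)
batch, d_stat, d_sem, d, n_classes = 4, 50, 32, 16, 5
stat_feats = rng.random((batch, d_stat))
sem_feats = rng.normal(size=(batch, d_sem))
W_stat = rng.normal(scale=0.1, size=(d_stat, d))
W_sem = rng.normal(scale=0.1, size=(d_sem, d))
v = rng.normal(size=d)
W_out = rng.normal(scale=0.1, size=(d, n_classes))

fused, alpha = attention_fuse(stat_feats, sem_feats, W_stat, W_sem, v)
probs = softmax(fused @ W_out, axis=1)           # class probabilities (batch, n_classes)
```

In this sketch, `alpha[i]` gives the relative weight assigned to the statistical and semantic views for example `i`, which is the kind of quantity an ablation or interpretability analysis could inspect. In a trained model the projection matrices and context vector would be learned jointly with the classifier rather than sampled at random.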