Email is one of the most widely used means of communication, with millions of people and businesses relying on it daily to exchange knowledge and information. However, the growth in email users has brought a dramatic increase in spam in recent years, and processing and managing email properly is becoming increasingly difficult for individuals and companies. This article proposes a novel technique for email spam detection based on a combination of convolutional neural networks, gated recurrent units, and attention mechanisms. During training, the attention mechanism lets the network focus selectively on the relevant parts of the email text. The major contribution of this study is the use of convolution layers to extract more meaningful, abstract, and generalizable features through hierarchical representation. The study also incorporates cross-dataset evaluation, which yields performance results that are more independent of the model's training dataset. According to the cross-dataset evaluation results, the proposed technique improves on existing attention-based techniques by using temporal convolutions, which provide more flexible receptive field sizes. The results are compared with those of state-of-the-art models and show that the proposed approach outperforms them.
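To illustrate what focusing selectively on parts of the email text can mean in practice, here is a minimal sketch of attention pooling over a sequence of hidden states. It is not the model described above: the dot-product scoring, the `query` vector, and the toy values are assumptions chosen for illustration, and the attention form actually used in the study may differ.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each time step by its relevance and return the weighted sum.

    hidden_states: list of T vectors, e.g. per-token recurrent outputs (assumed).
    query: a vector of the same size; in a real model it would be learned (assumed).
    """
    # Dot-product relevance score for every time step.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted average of the hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights

# Toy input: three time steps with two features each (made-up values).
states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]
context, weights = attention_pool(states, query)
```

In this toy input the second time step has the highest score, so it receives the largest weight and dominates the context vector that a downstream classifier would see.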