This paper investigates the effectiveness of large language models (LLMs) in email spam detection by comparing prominent models from three distinct families: BERT-like, Sentence Transformers, and Seq2Seq. Additionally, we examine well-established machine learning techniques for spam detection, such as Naïve Bayes and LightGBM, as baseline methods. We assess the performance of these models across four public datasets, using different numbers of training samples (the full training set and few-shot settings). Our findings reveal that, in the majority of cases, LLMs surpass the popular baseline techniques, particularly in few-shot scenarios. This adaptability makes LLMs uniquely suited to spam detection tasks, where labeled samples are limited and models require frequent updates. Additionally, we introduce Spam-T5, a Flan-T5 model specifically adapted and fine-tuned for detecting email spam. Our results demonstrate that Spam-T5 surpasses baseline models and other LLMs in the majority of scenarios, particularly when only a limited number of training samples is available. Our code is publicly available at https://github.com/jpmorganchase/emailspamdetection.