LiveSecBench: A Dynamic and Event-Driven Safety Benchmark for Chinese Language Model Applications - Zhuanzhi Papers

LiveSecBench: A Dynamic and Event-Driven Safety Benchmark for Chinese Language Model Applications

Yudong Li, Peiru Yang, Feng Huang, Zhongliang Yang, Kecheng Wang, Haitian Li, Baocheng Chen, Xingyu An, Ziyu Liu, Youdan Yang, Kejiang Chen, Sifang Wan, Xu Wang, Yufei Sun, Liyan Wu, Ruiqi Zhou, Wenya Wen, Xingchi Gu, Tianxin Zhang, Yue Gao, Yongfeng Huang

We introduce LiveSecBench, a continuously updated safety benchmark designed specifically for Chinese-language LLM application scenarios. LiveSecBench constructs a high-quality and unique dataset through a pipeline that combines automated generation with human verification. By periodically releasing new versions that expand the dataset and update the evaluation metrics, LiveSecBench provides a robust and up-to-date standard for AI safety. In this report, we introduce our second release, v251215, which evaluates models across five dimensions: Public Safety, Fairness & Bias, Privacy, Truthfulness, and Mental Health Safety. We evaluate 57 representative LLMs using an ELO rating system, offering a leaderboard of the current state of Chinese LLM safety. The results are available at https://livesecbench.intokentech.cn/.
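The abstract states that models are ranked with an ELO rating system but gives no details of the setup. As a rough illustration only, the sketch below implements the textbook Elo update for pairwise comparisons. The initial rating of 1500, the K-factor of 32, the function names, and the model names are all assumptions made for this example and are not taken from the LiveSecBench report.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, model_a, model_b, score_a, k=32.0):
    """Update both ratings in place after one comparison.

    score_a is 1.0 if A's answer was judged safer, 0.0 if B's was,
    and 0.5 for a tie. k=32 is an assumed K-factor.
    """
    ra, rb = ratings[model_a], ratings[model_b]
    ea = expected_score(ra, rb)
    ratings[model_a] = ra + k * (score_a - ea)
    ratings[model_b] = rb + k * ((1.0 - score_a) - (1.0 - ea))


def leaderboard(comparisons, initial=1500.0, k=32.0):
    """Build a ranked list from (model_a, model_b, score_a) tuples."""
    ratings = defaultdict(lambda: initial)  # assumed starting rating
    for model_a, model_b, score_a in comparisons:
        update_elo(ratings, model_a, model_b, score_a, k)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical pairwise judgments; the model names are placeholders.
    matches = [
        ("model_a", "model_b", 1.0),
        ("model_b", "model_c", 0.5),
        ("model_a", "model_c", 1.0),
    ]
    for name, rating in leaderboard(matches):
        print(f"{name}: {rating:.1f}")
```

Because each update adds to one model exactly what it subtracts from the other, the total rating is conserved, so a leaderboard built this way reflects only relative standing among the compared models.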
