LAnoBERT: System Log Anomaly Detection based on BERT Masked Language Model - 专知论文

Masked language modeling · Anomaly detection · Language modeling · MoDELS · Masking

November 20, 2021

LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

Yukyung Lee, Jina Kim, Pilsung Kang

System logs generated by a computer system are large-scale data collected simultaneously, and they serve as the basic data for diagnosing simple errors and for detecting external adversarial intrusions or abnormal insider behavior. The aim of system log anomaly detection is to promptly identify anomalies while minimizing human intervention, which is a critical problem in industry. Previous studies performed anomaly detection with algorithms after converting various forms of log data into standardized templates using a parser. These methods require generating a template to refine the log key. In particular, a template corresponding to each specific event must be defined in advance for all log data, and information within the log key may be lost in the process. In this study, we propose LAnoBERT, a parser-free system log anomaly detection method that uses the BERT model, which exhibits excellent natural language processing performance. LAnoBERT trains the model through masked language modeling, a BERT-based pre-training method, and performs unsupervised anomaly detection at inference time using the masked language modeling loss of each log key word. In experiments on the benchmark log datasets HDFS and BGL, LAnoBERT outperformed previous methods and also outperformed certain supervised learning-based models.
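
The scoring idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a count-based predictor (the hypothetical `ToyMaskedModel`) stands in for BERT, the log lines are made up for the example, and averaging the per-token losses into a single score is an assumption made here. The abstract only states that the masked language modeling loss per log key word is used.

```python
import math
from collections import Counter, defaultdict

BOS, EOS = "[CLS]", "[SEP]"  # boundary markers, named after BERT's special tokens


def tokenize(line):
    # Whitespace split on the raw log line: no parser and no event template.
    return line.lower().split()


class ToyMaskedModel:
    """Count-based stand-in for a masked language model (this is NOT BERT).

    It estimates P(token | left neighbour, right neighbour) from normal logs.
    """

    def __init__(self):
        self.context_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, normal_lines):
        # "Training" only sees normal logs, mirroring the unsupervised setting.
        for line in normal_lines:
            tokens = [BOS] + tokenize(line) + [EOS]
            for i in range(1, len(tokens) - 1):
                context = (tokens[i - 1], tokens[i + 1])
                self.context_counts[context][tokens[i]] += 1
                self.vocab.add(tokens[i])

    def masked_loss(self, left, right, token):
        # Negative log-likelihood of the masked token, with add-one smoothing.
        counts = self.context_counts.get((left, right), Counter())
        total = sum(counts.values())
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen tokens
        probability = (counts[token] + 1) / (total + vocab_size)
        return -math.log(probability)


def anomaly_score(model, line):
    # Mask each position in turn and average the losses (assumed aggregation).
    tokens = [BOS] + tokenize(line) + [EOS]
    losses = [
        model.masked_loss(tokens[i - 1], tokens[i + 1], tokens[i])
        for i in range(1, len(tokens) - 1)
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Made-up log lines, for illustration only.
normal_logs = [
    "Receiving block blk_1 src 10.0.0.1 dest 10.0.0.2",
    "Received block blk_1 of size 67108864 from 10.0.0.1",
    "PacketResponder 1 for block blk_1 terminating",
] * 20

model = ToyMaskedModel()
model.fit(normal_logs)

normal_score = anomaly_score(model, "Receiving block blk_1 src 10.0.0.1 dest 10.0.0.2")
anomalous_score = anomaly_score(model, "Exception while serving block blk_1 to 10.0.0.9")
print(f"normal={normal_score:.3f} anomalous={anomalous_score:.3f}")
```

A line whose words are easy to predict from their context gets a low score, while a line with unfamiliar words or word order gets a high one. In practice a threshold on the score would separate the two, and that threshold would have to be chosen on held-out data.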
