August 27, 2021

Graph-based Incident Aggregation for Large-Scale Online Service Systems


Zhuangbin Chen, Jinyang Liu, Yuxin Su, Hongyu Zhang, Xuemin Wen, Xiao Ling, Yongqiang Yang, Michael R. Lyu

From arXiv. Accepted by the 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE'21).

As online service systems continue to grow in terms of complexity and volume, how service incidents are managed will significantly impact company revenue and user trust. Due to the cascading effect, cloud failures often come with an overwhelming number of incidents from dependent services and devices. To pursue efficient incident management, related incidents should be quickly aggregated to narrow down the problem scope. To this end, in this paper, we propose GRLIA, an incident aggregation framework based on graph representation learning over the cascading graph of cloud failures. A representation vector is learned for each unique type of incident in an unsupervised and unified manner, which is able to simultaneously encode the topological and temporal correlations among incidents. Thus, it can be easily employed for online incident aggregation. In particular, to learn the correlations more accurately, we try to recover the complete scope of failures' cascading impact by leveraging fine-grained system monitoring data, i.e., Key Performance Indicators (KPIs). The proposed framework is evaluated with real-world incident data collected from a large-scale online service system of Huawei Cloud. The experimental results demonstrate that GRLIA is effective and outperforms existing methods. Furthermore, our framework has been successfully deployed in industrial practice.
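To make the idea of using per-type representation vectors for online aggregation concrete, here is a minimal sketch. It is not GRLIA's actual algorithm: the abstract does not specify the aggregation rule, so this assumes a simple one, in which an incoming incident joins an existing group if it arrives within a time window of the group's latest incident and its type vector is cosine-similar to the vector of some incident already in the group. The incident type names, the vectors, the threshold, and the window are all hypothetical values chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def aggregate(incidents, vectors, sim_threshold=0.8, window=600):
    """Greedily group a time-sorted stream of (timestamp_seconds, incident_type).

    Assumed rule (not taken from the paper): an incident joins the first group
    whose latest incident is at most `window` seconds old and that contains at
    least one incident whose type vector has cosine similarity >= sim_threshold.
    """
    groups = []
    for ts, itype in incidents:
        for group in groups:
            last_ts, _ = group[-1]
            recent = ts - last_ts <= window
            similar = any(
                cosine(vectors[itype], vectors[other]) >= sim_threshold
                for _, other in group
            )
            if recent and similar:
                group.append((ts, itype))
                break
        else:
            # No compatible group found: start a new one.
            groups.append([(ts, itype)])
    return groups


# Hypothetical learned vectors, one per incident type.
vectors = {
    "disk_full": [1.0, 0.0],
    "io_latency": [0.9, 0.1],
    "login_fail": [0.0, 1.0],
}
incidents = [(0, "disk_full"), (30, "io_latency"), (60, "login_fail"), (5000, "disk_full")]
groups = aggregate(incidents, vectors)
```

With these toy inputs, the two storage-related incidents end up in one group, while the unrelated login failure and the much later disk incident each start their own group. A real deployment would need vectors learned from the cascading graph, as the abstract describes, and a carefully tuned threshold and window.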

