The multi-source data generated by distributed systems provide a holistic description of the system. A learning model that harnesses the joint distribution of the different modalities can benefit applications that are critical for maintaining distributed systems. One such task is anomaly detection, where the goal is to detect deviations of the system's current behaviour from its expected behaviour. In this work, we use a joint representation of distributed traces and system log data for anomaly detection in distributed systems. We demonstrate that using traces and logs jointly produces better results than single-modality anomaly detection methods. Furthermore, we formalize a learning task, next template prediction (NTP), that serves as a generalization of anomaly detection for both logs and distributed traces. Finally, we demonstrate that this formalization allows template embeddings to be learned for both traces and logs. The joint embeddings can be reused in other applications as a good initialization for spans and logs.
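To make the idea of next template prediction concrete, the following is a minimal sketch and not the model used in this work. It assumes that log messages and trace spans have already been mapped to template identifiers and merged into one ordered sequence, and it replaces the learned model with a simple count-based predictor over a fixed context window. The names `fit_ntp` and `flag_anomalies`, the `context` and `top_k` parameters, and the template labels are all hypothetical.

```python
from collections import Counter, defaultdict


def fit_ntp(sequences, context=2):
    """Count which template follows each window of `context` templates.

    Assumption: each sequence is a list of template IDs in which log
    templates and span templates are already interleaved in order.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(context, len(seq)):
            counts[tuple(seq[i - context:i])][seq[i]] += 1
    return counts


def flag_anomalies(counts, seq, context=2, top_k=1):
    """Return positions whose observed template is not in the top-k predictions."""
    anomalies = []
    for i in range(context, len(seq)):
        window = tuple(seq[i - context:i])
        # An unseen window yields no predictions, so the position is flagged.
        ranked = counts.get(window, Counter()).most_common(top_k)
        predicted = [template for template, _ in ranked]
        if seq[i] not in predicted:
            anomalies.append(i)
    return anomalies


# Hypothetical template IDs: "log:" for log templates, "span:" for trace spans.
normal = ["log:request_received", "span:auth", "span:db_query", "log:response_sent"]
observed = ["log:request_received", "span:auth", "log:timeout", "log:response_sent"]

model = fit_ntp([normal] * 10)
print(flag_anomalies(model, normal))
print(flag_anomalies(model, observed))
```

In this toy setting the unexpected `log:timeout` is flagged, and so is the position after it, because its context window was never seen during fitting. A learned sequence model would take the place of the counting step, and its template embeddings are what the abstract proposes to reuse.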