Modern software development and operations rely on monitoring to understand how systems behave in production. The data provided by application logs and the runtime environment are essential to detect and diagnose undesired behavior and to improve system reliability. However, despite the rich ecosystem around industry-ready log solutions, monitoring complex systems and getting insights from log data remain a challenge. Researchers and practitioners have been actively working to address several challenges related to logs, e.g., how to give developers better tooling support for logging decisions, how to effectively process and store log data, and how to extract insights from log data. A holistic view of the logging research field is key to providing directions and to disseminating the state of the art for technology transfer. In this paper, we study 108 papers (72 research track papers, 24 journal papers, and 12 industry track papers) from different communities (e.g., machine learning, software engineering, and systems) and structure the research field in light of the life-cycle of log data. Our analysis shows that (1) logging is a challenge not only in open source projects but also in industry, (2) machine learning is a promising approach to enable contextual analysis of source code for log recommendation, but further investigation is required to assess the usability of those tools in practice, (3) few studies have addressed the efficient persistence of log data, and (4) there are open opportunities to analyze application logs and to evaluate state-of-the-art log analysis techniques in a DevOps context.