Template mining is one of the foundational tasks in log analysis, which in turn supports the diagnosis and troubleshooting of large-scale Web applications. This paper develops a human-in-the-loop template mining framework to support interactive log analysis, which is highly desirable in real-world diagnosis and troubleshooting of Web applications but which previous template mining algorithms fail to support. We formulate three types of lightweight user feedback and, based on them, design three atomic human-in-the-loop template mining algorithms. We derive mild conditions under which the outputs of the proposed algorithms are provably correct. We also derive upper bounds on the computational complexity and query complexity of each algorithm. We demonstrate the versatility of the proposed algorithms by combining them to improve the template mining accuracy of five representative algorithms on sixteen widely used benchmark datasets.