Tools and Benchmarks for Automated Log Parsing

Jieming Zhu; Jinyang Liu; Michael R. Lyu; Pinjia He; Qi Xie; Shilin He; Zibin Zheng

arxiv: 1811.03509 · v2 · pith:MUODOP2Wnew · submitted 2018-11-08 · 💻 cs.SE

classification 💻 cs.SE

keywords systems, automated, parsing, logs, many, parsers, software, tools

Logs are imperative in the development and maintenance process of many software systems. They record detailed runtime information that allows developers and support engineers to monitor their systems and dissect anomalous behaviors and errors. The increasing scale and complexity of modern software systems, however, make the volume of logs explode. In many cases, the traditional way of manual log inspection becomes impractical. Many recent studies, as well as industrial tools, resort to powerful text search and machine learning-based analytics solutions. Due to the unstructured nature of logs, a crucial first step is to parse log messages into structured data for subsequent analysis. In recent years, automated log parsing has been widely studied in both academia and industry, producing a series of log parsers based on different techniques. To better understand the characteristics of these log parsers, in this paper we present a comprehensive evaluation study on automated log parsing and further release the tools and benchmarks for easy reuse. More specifically, we evaluate 13 log parsers on a total of 16 log datasets spanning distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software. We report the benchmarking results in terms of accuracy, robustness, and efficiency, which are of practical importance when deploying automated log parsing in production. We also share the success stories and lessons learned in an industrial application at Huawei. We believe that our work could serve as the basis and provide valuable guidance to future research and deployment of automated log parsing.
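
To make the parsing step mentioned in the abstract concrete, here is a minimal sketch of turning free-text log messages into structured events. It is not one of the 13 parsers evaluated in the paper; it is a toy regex-masking approach, and the masking rules, the function names `to_template` and `parse`, and the sample messages are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical masking rules (assumptions, not from the paper): each pattern
# describes a token shape that is treated as a variable parameter.
# Order matters: IP addresses must be masked before bare integers.
MASKS = [
    re.compile(r"blk_-?\d+"),                    # block-style identifiers
    re.compile(r"\d+\.\d+\.\d+\.\d+(?::\d+)?"),  # IPv4 address, optional port
    re.compile(r"\b\d+\b"),                      # bare integers
]


def to_template(message: str) -> str:
    """Replace variable-looking tokens with the wildcard <*>."""
    for pattern in MASKS:
        message = pattern.sub("<*>", message)
    return message


def parse(messages):
    """Group raw messages by the template they reduce to."""
    groups = defaultdict(list)
    for message in messages:
        groups[to_template(message)].append(message)
    return dict(groups)


# Made-up sample messages in a distributed-file-system style.
SAMPLE = [
    "Received block blk_-1608999687919862906 of size 91178 from 10.250.19.102",
    "Received block blk_7503483334202473044 of size 233217 from 10.251.215.16",
    "Verification succeeded for blk_-4980916519894289629",
]

if __name__ == "__main__":
    for template, members in parse(SAMPLE).items():
        print(f"{len(members)} message(s): {template}")
```

Each resulting template acts as an event type, and the masked tokens are its parameters. Hand-written masks like these do not generalize across systems, which is the gap automated log parsers aim to close.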

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

KRONE: Scalable LLM-Augmented Log Anomaly Detection via Hierarchical Abstraction
cs.DB 2026-02 conditional novelty 7.0

KRONE derives semantic execution hierarchies from flat logs to enable modular multi-level anomaly detection with hybrid local and nested-aware detectors plus limited LLM use, delivering 10% F1 gains and over 100x data...

A Pvalue-guided Anomaly Detection Approach Combining Multiple Heterogeneous Log Parser Algorithms on IIoT Systems
cs.CR 2019-07 unverdicted novelty 3.0

P-value guided combination of heterogeneous log parsers detects anomalies in IIoT logs, tested on HDFS and real IIoT data with blockchain for integrity.