LLM4Log: A Systematic Review of Large Language Model-based Log Analysis
Pith reviewed 2026-05-15 08:18 UTC · model grok-4.3
The pith
A review of 145 papers maps LLM use across seven log analysis tasks and distills common design patterns and open adoption challenges.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Following a structured search and manual screening protocol completed in November 2025, the review identifies 145 unique papers on LLM-based log analysis across seven tasks. It synthesizes the field through a unified, task-driven taxonomy; summarizes common design patterns, including prompting and in-context learning, retrieval grounding, fine-tuning, tool and agent augmentation, and verification; and analyzes evaluation practices, datasets, metrics, and reproducibility. From these cross-paper findings it distills key lessons and open challenges for reliable real-world adoption, with emphasis on robustness under drift and long-tail events, grounding and faithfulness for operator-facing outputs, and deployment-oriented designs with verifiable behavior.
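The retrieval-grounding pattern named above can be illustrated with a minimal sketch. Everything here is hypothetical: the function names, the labeled history, and the token-overlap similarity (a cheap stand-in for the embedding retrievers that surveyed systems typically use).

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two log lines.
    A deliberately simple stand-in for embedding-based retrieval."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def build_grounded_prompt(query: str, history: list, k: int = 2) -> str:
    """Assemble a few-shot prompt from the k most similar labeled
    historical log lines, so the model's answer is grounded in
    precedent rather than free generation."""
    ranked = sorted(history, key=lambda ex: jaccard(query, ex[0]), reverse=True)
    shots = "\n".join(f"Log: {line}\nLabel: {label}" for line, label in ranked[:k])
    return f"{shots}\nLog: {query}\nLabel:"

# Invented example data, not drawn from any surveyed dataset.
history = [
    ("Disk quota exceeded on /dev/sda1", "anomaly"),
    ("User admin logged in from console", "normal"),
    ("Disk quota exceeded on /dev/sdb2", "anomaly"),
]
prompt = build_grounded_prompt("Disk quota exceeded on /dev/sdc1", history)
print(prompt)
```

The returned string would be handed to an LLM as context; the point is only that the few-shot examples are selected by similarity to the query line, which is the essence of the retrieval-grounding pattern the review catalogs.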
What carries the argument
A unified task-driven taxonomy that classifies LLM-based log analysis research into seven tasks from upstream logging-statement generation through parsing and downstream analysis while cross-cutting common design patterns.
If this is right
- Adoption of the identified design patterns such as retrieval grounding and verification steps can improve output reliability in LLM log analyzers.
- Evaluation practices must incorporate tests for drift and long-tail events to match real deployment conditions.
- Grounding mechanisms become necessary for any operator-facing outputs to maintain faithfulness.
- Reproducibility gaps indicate that shared datasets and standardized benchmarks will accelerate progress.
- Deployment considerations around latency, cost, and privacy point to the value of hybrid systems that combine LLMs with lighter verification layers.
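The "lighter verification layer" mentioned in the last point can be sketched for log parsing: before accepting an LLM-proposed template, run a deterministic check that the template actually matches the raw line it is supposed to explain. The template strings and function name below are invented for illustration, not output from any surveyed system.

```python
import re

def template_matches(template: str, raw_line: str) -> bool:
    """Lightweight verification: accept an LLM-proposed log template
    (with <*> placeholders) only if it matches the raw line."""
    # Escape literal text, then turn each <*> placeholder into a
    # non-greedy wildcard group.
    pattern = re.escape(template).replace(re.escape("<*>"), r"(.+?)")
    return re.fullmatch(pattern, raw_line) is not None

raw = "Connection from 10.0.0.5 closed after 120 ms"
good = "Connection from <*> closed after <*> ms"   # hypothetical LLM output
bad = "Connection to <*> refused"                  # hallucinated template

print(template_matches(good, raw))  # True: accept
print(template_matches(bad, raw))   # False: fall back or re-prompt
```

The check costs a regex match per line, so it can gate every model output without adding meaningful latency — one concrete shape the hybrid LLM-plus-verifier designs above could take.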
Where Pith is reading between the lines
- The taxonomy offers a ready structure for future reviews that compare LLM methods directly against earlier non-LLM log analysis techniques.
- Privacy and context-length constraints may push development of domain-specific smaller models fine-tuned only on log data.
- Lessons on hallucination risks could transfer to LLM applications in other software engineering tasks such as code review or incident summarization.
- If the design patterns prove stable, they could serve as a template for LLM pipelines that process other forms of semi-structured operational data.
Load-bearing premise
The structured search protocol and manual screening captured a representative and unbiased sample of all relevant LLM-based log analysis papers published up to November 2025.
What would settle it
A later exhaustive search that locates a substantially larger or materially different set of papers on LLM-based log analysis published before November 2025 would show the collection was incomplete.
Original abstract
Software systems generate massive, evolving, semi-structured logs that are central to reliability engineering and AIOps, yet difficult to analyze at scale under drift and limited labels. Recent advances in pretrained Transformer models and instruction-tuned large language models (LLMs) have reshaped log analysis by enabling semantic generalization and cross-source evidence integration, but also introducing deployment risks such as context limits, latency/cost, privacy constraints, and hallucinations. This paper presents LLM4Log, a systematic review of LLM-based log analysis across the end-to-end pipeline, from upstream logging-statement generation and maintenance to log parsing/structuring and downstream tasks including anomaly detection, failure prediction, root cause analysis, and log summarization. Following a structured search and manual screening protocol, we completed literature collection in November 2025 and identified 145 unique papers across seven logging tasks. We synthesize the research area through a unified, task-driven taxonomy, summarize common design patterns (prompting/ICL, retrieval grounding, fine-tuning, tool/agent augmentation, and verification), and analyze evaluation practices, datasets, metrics, and reproducibility. Based on these cross-paper analyses, we distill key lessons and open challenges for reliable real-world adoption. We emphasize robustness under drift and long-tail events, grounding and faithfulness for operator-facing outputs, and deployment-oriented designs with verifiable behavior.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents LLM4Log, a systematic review of large language model applications to log analysis in software systems. Following a structured search and manual screening protocol completed in November 2025, the authors identify 145 unique papers across seven tasks (logging statement generation/maintenance, parsing/structuring, anomaly detection, failure prediction, root cause analysis, and summarization). They synthesize the literature into a unified task-driven taxonomy, catalog common design patterns (prompting/ICL, retrieval grounding, fine-tuning, tool/agent augmentation, verification), analyze evaluation practices/datasets/metrics/reproducibility, and distill lessons plus open challenges centered on drift robustness, output faithfulness, and deployable designs.
Significance. If the screened corpus is representative, the review supplies a timely consolidation of a fast-growing intersection between LLMs and AIOps/reliability engineering. By extracting cross-paper patterns and explicitly linking them to practical risks (context limits, hallucinations, privacy), it offers researchers a shared reference frame and gives practitioners concrete guidance on moving from prototypes to verifiable production systems. The absence of internal circularity and the direct grounding in the 145-paper corpus strengthen its utility as a field map.
Minor comments (3)
- [§3] §3 (Search Protocol): the PRISMA-style flow diagram would be clearer if it explicitly reported the number of papers excluded at each screening stage rather than only final counts.
- [Table 2] Table 2 (Design Patterns): several rows list 'hybrid' approaches without a footnote defining the exact combination criteria, which could confuse readers comparing prompting-only vs. retrieval-augmented entries.
- [§5.3] §5.3 (Reproducibility): the discussion of dataset availability would be strengthened by adding a column or supplementary table indicating which of the 145 papers release code or data.
Simulated Author's Rebuttal
We thank the referee for their positive evaluation of the manuscript and their recommendation to accept. We are pleased that the review recognizes the timeliness of the systematic consolidation of LLM-based log analysis research and its practical value for both researchers and practitioners in AIOps.
Circularity Check
No significant circularity in systematic review synthesis
Full rationale
This paper is a systematic literature review that follows standard SE review protocols: structured search, manual screening, and synthesis of 145 external papers into a task-driven taxonomy. No internal mathematical derivations, fitted parameters, self-definitional loops, or load-bearing self-citations exist. The taxonomy, design patterns, and lessons are presented as direct outcomes of the screened external literature rather than reductions to the paper's own inputs. The methodology is self-contained against external benchmarks and does not rely on any circular chain.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1] Siraaj Akhtar, Saad Khan, and Simon Parkinson. LLM-based event log analysis techniques: A survey. arXiv preprint arXiv:2502.00677, 2025.
- [2] Crispin Almodovar, Fariza Sabrina, Sarvnaz Karimi, and Salahuddin Azad. LogFiT: Log anomaly detection using fine-tuned language models. IEEE Transactions on Network and Service Management, 21(2):1715–1723, 2024.
- [3] Apache Software Foundation. Apache Log4j 2. Online documentation, 2024. URL https://logging.apache.org/log4j/2.x/.
- [4] Merve Astekin, Max Hort, and Leon Moonen. A comparative study on large language models for log parsing. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2024), pages 36–47. ACM, 2024. doi: 10.1145/3674805.3686684.
- [5] Prasasthy Balasubramanian, Dumindu Kankanamge, Ekaterina Gilman, and Mourad Oussalah. AnomalyExplainerBot: Explainable AI for LLM-based anomaly detection using BERTViz & Captum, 2025.
- [6] Viktor Beck, Max Landauer, Markus Wurzenberger, Florian Skopik, and Andreas Rauber. System log parsing with large language models: A review. arXiv preprint arXiv:2504.04877, 2025.
- [7] Sidahmed Benabderrahmane, Petko Valtchev, James Cheney, and Talal Rahwan. APT-LLM: Embedding-based anomaly detection of cyber advanced persistent threats using large language models. In Proceedings of the 13th International Symposium on Digital Forensics and Security (ISDFS 2025), pages 1–6, 2025. doi: 10.1109/ISDFS65363.2025.11011912.
- [8] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166, 1994. doi: 10.1109/72.279181.
- [9] Jasmin Bogatinovski and Odej Kao. Auto-logging: AI-centred logging instrumentation. In 2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), pages 95–100, 2023. doi: 10.1109/ICSE-NIER58687.2023.00023.
- [10] Bogdan Bogdan, Arina Cazacu, and Laura Vasilie. Good enough to learn: LLM-based anomaly detection in ECU logs without reliable labels, 2025. URL https://arxiv.org/abs/2507.01077. Accepted to IEEE Intelligent Vehicles Symposium (IV) 2025.
- [11] Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Jeff Dean, et al. On the opportunities and risks of foundation models. arXiv preprint, 2021.
- [12] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.
- [13] Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models. In 30th USENIX Security Symposium (USENIX Security 21), pages 2633–2650, 2021.
- [14] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, et al. Evaluating large language models trained on code. arXiv preprint, 2021.
- [15] Song Chen and Hai Liao. BERT-Log: Anomaly detection for system logs based on pre-trained language model. Applied Artificial Intelligence, 36(1):2145642, 2022. doi: 10.1080/08839514.2022.2145642.
- [16] Xiaolei Chen, Jie Shi, Jia Chen, Peng Wang, and Wei Wang. EPAS: Efficient online log parsing via asynchronous scheduling of LLM queries. In Proceedings of the 41st IEEE International Conference on Data Engineering (ICDE 2025), pages 4025–4037. IEEE, 2025. doi: 10.1109/ICDE.2025.00318.
- [17] Yinfang Chen, Huaibing Xie, Minghua Ma, Yu Kang, Xin Gao, Liu Shi, Yunjie Cao, Xuedong Gao, Hao Fan, Ming Wen, et al. Automatic root cause analysis via large language models for cloud incidents. In Proceedings of the Nineteenth European Conference on Computer Systems, pages 674–688, 2024.
- [18] Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems, volume 30, 2017.
- [19] Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555, 2020.
- [20] Clara Corbelle, Victor Carneiro, and Fidel Cacheda. Semantic hierarchical classification applied to anomaly detection using system logs with a BERT model. Applied Sciences, 14(13), 2024. doi: 10.3390/app14135388.
- [21] Tianyu Cui, Ruowei Fu, Changchang Liu, Yuhe Ji, Wenwei Gu, Shenglin Zhang, Yongqian Sun, and Dan Pei. AetherLog: Log-based root cause analysis by integrating large language models with knowledge graphs. In 2025 IEEE 36th International Symposium on Software Reliability Engineering (ISSRE), pages 49–60. IEEE, 2025.
- [22] Tianyu Cui, Shiyu Ma, Ziang Chen, Tong Xiao, Chenyu Zhao, Shimin Tao, Yilun Liu, Shenglin Zhang, Duoming Lin, Changchang Liu, Yuzhe Cai, Weibin Meng, Yongqian Sun, and Dan Pei. LogEval: A comprehensive benchmark suite for LLMs in log analysis. Empirical Software Engineering, 30(6), October 2025. doi: 10.1007/s10664-025-10701-6.
- [23] Hetong Dai, Heng Li, Che-Shao Chen, Weiyi Shang, and Tse-Hsun Chen. Logram: Efficient log parsing using n-gram dictionaries. IEEE Transactions on Software Engineering, 48(3):879–892, 2020.
- [24] Delgan and contributors. loguru: Python logging made simple. Online documentation, 2024. URL https://github.com/Delgan/loguru.
- [25] Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. QLoRA: Efficient finetuning of quantized LLMs, 2023. URL https://arxiv.org/abs/2305.14314.
- [26] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, volume 1 (long and short papers), pages 4171–4186, 2019.
- [27] Shengcheng Duan, Yihua Xu, Sheng Zhang, Shen Wang, and Yue Duan. PDLogger: Automated logging framework for practical software development, 2025. URL https://arxiv.org/abs/2507.19951.
- [28] Yiqi Duan, Jianliang Xu, Changyu Fan, and Zixin Liu. Unsupervised log parsing based on large language models and entropy. In 2025 11th International Symposium on System Security, Safety, and Reliability (ISSSR), pages 1–10. IEEE, 2025.
- [29] Chris Egersdoerfer, Di Zhang, and Dong Dai. Early exploration of using ChatGPT for log-based anomaly detection on parallel file systems logs. In Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC '23), pages 315–316. ACM, 2023. doi: 10.1145/3588195.3595943.
- [30] Maryam Ekhlasi, Anurag Prakash, Maxime Lamothe, and Michel Dagenais. InsightAI: Root cause analysis in large log files with private data using large language model. In 2025 IEEE/ACM 4th International Conference on AI Engineering – Software Engineering for AI (CAIN), pages 31–41, 2025. doi: 10.1109/CAIN66642.2025.00012.
- [31] Mahmoudreza Entezami, Shahabeddin Rahimi Harsini, David Houshangi, and Zahra Entezami. A novel framework for detecting anomalies in network security using LLM and deep learning. Journal of Electrical Systems, 21(1s):294–302, 2025. doi: 10.52783/jes.8791.
- [32] Kawin Ethayarajh. How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019.
- [33] European Parliament and Council of the European Union. Regulation (EU) 2016/679 (General Data Protection Regulation). Official Journal of the European Union, L119, 2016. URL https://eur-lex.europa.eu/eli/reg/2016/679/oj.
- [34] Asma Fariha, Vida Gharavian, Masoud Makrehchi, Shahryar Rahnamayan, Sanaa Alwidian, and Akramul Azim. Log anomaly detection by leveraging LLM-based parsing and embedding with attention mechanism. In 2024 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), pages 859–863. IEEE, 2024.
- [35] Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. CodeBERT: A pre-trained model for programming and natural languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1536–1547, 2020.
- [36] Joint Task Force. Security and privacy controls for information systems and organizations (NIST Special Publication 800-53 Revision 5). URL https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final.
- [37] (entry merged with [36] above)
- [38] Amy Foster and Selva Kumar. Combining LLMs and shell logs to predict backup failures. July 2025.
- [39] Qiang Fu, Jieming Zhu, Wenlu Hu, Jian-Guang Lou, Rui Ding, Qingwei Lin, Dongmei Zhang, and Tao Xie. Where do developers log? An empirical study on logging practices in industry. In Companion Proceedings of the 36th International Conference on Software Engineering, pages 24–33. ACM, 2014. doi: 10.1145/2591062.2591175.
- [40] Ying Fu, Meng Yan, Pinjia He, Chao Liu, Xiaohong Zhang, and Dan Yang. End-to-end log statement generation at block-level. Journal of Systems and Software, 216:112146, 2024. doi: 10.1016/j.jss.2024.112146.
- [41] Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christopher Endres, Thorsten Holz, and Mario Fritz. Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection, 2023. URL https://arxiv.org/abs/2302.12173.
- [42] Wei Guan, Jian Cao, Shiyou Qian, and Jianqi Gao. LogLLM: Log-based anomaly detection using large language models. arXiv preprint.
- [43]
- [44] H. Guo, S. Yuan, and X. Wu. LogBERT: Log anomaly detection via BERT. In 2021 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE, 2021.
- [45] Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, and Qi Tian. LogFormer: A pre-train and tuning pipeline for log anomaly detection. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 135–143, 2024.
- [46] Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, and Zhoujun Li. OWL: A large language model for IT operations. In Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024), 2024.
- [47] Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, and Noah A. Smith. Don't stop pretraining: Adapt language models to domains and tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), pages 8342–8360, 2020.
- [48] Fatemeh Hadadi, Qinghua Xu, Domenico Bianculli, and Lionel Briand. LLM meets ML: Data-efficient anomaly detection on unstable logs. ACM Transactions on Software Engineering and Methodology, 2025. doi: 10.1145/3771283. arXiv:2406.07467.
- [49] Minghua He, Tong Jia, Chiming Duan, Huaqian Cai, Ying Li, and Gang Huang. LLMeLog: An approach for anomaly detection based on LLM-enriched log events. In 2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE), pages 132–143. IEEE, 2024.
- [50] Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R. Lyu. Drain: An online log parsing approach with fixed depth tree. In 2017 IEEE International Conference on Web Services (ICWS), pages 33–40, 2017. doi: 10.1109/ICWS.2017.13.
- [51] Shilin He, Pinjia He, Zhuangbin Chen, Tianyi Yang, Yuxin Su, and Michael R. Lyu. A survey on automated log analysis for reliability engineering. ACM Computing Surveys (CSUR), 54(6):1–37, 2021.
- [52] Shiming He, Ying Lei, Ying Zhang, Kun Xie, and Pradip Kumar Sharma. Parameter-efficient log anomaly detection based on pre-training model and LoRA. In 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), pages 207–217. IEEE, 2023.
- [53] Yi Wen Heng, Zeyang Ma, Zhenhao Li, Dong Jae Kim, and Tse-Hsun Chen. Benchmarking open-source large language models for log level suggestion. In 2025 IEEE Conference on Software Testing, Verification and Validation (ICST), pages 314–325, 2025. doi: 10.1109/ICST62969.2025.10988921.
- [54] Jordis Emilia Herrmann, Aswath Mandakath Gopinath, Mikael Norrlof, and Mark Niklas Mueller. Diagnosing robotics systems issues with large language models – a case study. In ICLR 2025 Workshop on Foundation Models in the Wild.
- [55] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997. doi: 10.1162/neco.1997.9.8.1735.
- [56] András Horváth, András Oláh, Attila Pintér, Bálint Siklósi, Gergely Lukács, István Z. Reguly, Kálmán Tornai, Tamás Zsedrovits, and Zoltán Máthé. Anomaly detection algorithms for real-time log data analysis at scale. IEEE Access, 13:136288–136311, 2025. doi: 10.1109/ACCESS.2025.3565575.
- [57] Changze Hu, Yu Fang, Jinhua Wu, Haoyang Li, and Geng Wang. Research on log anomaly detection based on sentence-BERT. Electronics, 12(17):3580, 2023. doi: 10.3390/electronics12173580.
- [58] Junjie Huang, Zhihan Jiang, Jinyang Liu, Yintong Huo, Jiazhen Gu, Zhuangbin Chen, Cong Feng, Hui Dong, Zengyin Yang, and Michael R. Lyu. Demystifying and extracting fault-indicating information from logs for failure diagnosis. In 2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE), pages 511–522. IEEE, 2024.
- [59] Junjie Huang, Zhihan Jiang, Zhuangbin Chen, and Michael R. Lyu. No more labelled examples? An unsupervised log parser with LLMs. In Proceedings of the ACM on Software Engineering (FSE 2025). ACM, 2025. doi: 10.1145/3729377.
- [60] Shaohan Huang, Yi Liu, Carol J. Fung, Rong He, Yining Zhao, Hailong Yang, and Zhongzhi Luan. HitAnomaly: Hierarchical transformers for anomaly detection in system log. IEEE Transactions on Network and Service Management, 17(2):2064–2076, 2020. doi: 10.1109/TNSM.2020.3034647.
- [61] Shaohan Huang, Yi Liu, Carol J. Fung, He Wang, Hailong Yang, and Zhongzhi Luan. Improving log-based anomaly detection by pre-training hierarchical transformers. IEEE Transactions on Computers, 72(9):2656–2667, 2023. doi: 10.1109/TC.2023.3257518.
- [62] Xin Huang, Ting Zhang, and Wen Zhao. LogRules: Enhancing log analysis capability of large language models through rules. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 452–470, Albuquerque, New Mexico, April 2025.
- [63]
- [64]
- [65] Jaeyoon Jeong, Insung Baek, Byungwoo Bang, Junyeon Lee, Uiseok Song, and Seoung Bum Kim. FALL: Prior failure detection in large scale system based on language model. IEEE Transactions on Dependable and Secure Computing, 22(1):279–291, 2025. doi: 10.1109/TDSC.2024.3396166.
- [66] Xin Ji, Le Zhang, Wenya Zhang, Fang Peng, Yifan Mao, Xingchuang Liao, and Kui Zhang. LEMAD: LLM-empowered multi-agent system for anomaly detection in power grid services. Electronics, 14(15):3008, 2025. doi: 10.3390/electronics14153008.
- [67] Yuhe Ji, Yilun Liu, Feiyu Yao, Minggui He, Shimin Tao, Xiaofeng Zhao, Chang Su, Xinhua Yang, Weibin Meng, Yuming Xie, Boxing Chen, Shenglin Zhang, and Yongqian Sun. Adapting large language models to log analysis with interpretable domain knowledge. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM '25).
- [68] Ziwei Ji, Nayeon Lee, Rita Frieske, Tianle Yu, Dan Su, Yan Xu, Etsuko Ishii, Yejin Bang, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation, 2023. URL https://arxiv.org/abs/2304.04710.
- [69] Zhihan Jiang, Jinyang Liu, Zhuangbin Chen, Yichen Li, Junjie Huang, Yintong Huo, Pinjia He, Jiazhen Gu, and Michael R. Lyu. LILAC: Log parsing using LLMs with adaptive parsing cache. Proceedings of the ACM on Software Engineering, 1(FSE), 2024. doi: 10.1145/3643733.
- [70] Saurav Kadavath, Tom Conerly, Amanda Askell, Tom Henighan, Andy Jones, Nicholas Joseph, Ben Mann, Nova DasSarma, Dawn Drain, Nelson Chen, Yuntao Bai, Jared Kaplan, Sam McCandlish, Dario Amodei, Ethan Chen, and Catherine Olsson. Language models (mostly) know what they know, 2022. URL https://arxiv.org/abs/2207.05221.
- [71] Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361, 2020.
- [72] Crystal Karlsen, Denis Copstein, Yue Luo, Benjamin Schwartzentruber, Tim Niblett, and Olivier Rouyer. Exploring semantic vs. syntactic features for unsupervised learning on application log files. In 2023 International Conference on Cyber Security and Networks (CSNet), 2023. doi: 10.1109/CSNet59123.2023.10339765.
- [73] Egil Karlsen, Xiao Luo, Nur Zincir-Heywood, and Malcolm Heywood. Benchmarking large language models for log analysis, security, and interpretation. Journal of Network and Systems Management, 32:59, 2024. doi: 10.1007/s10922-024-09831-x.
- [74] Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. Dense passage retrieval for open-domain question answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6769–6781, 2020.
- [75] Karen Kent and Murugiah Souppaya. NIST Special Publication 800-92: Guide to computer security log management, 2006. URL https://csrc.nist.gov/publications/detail/sp/800-92/final.
- [76] Van-Hoang Le and Hongyu Zhang. Log-based anomaly detection without log parsing. In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2021.
- [77] Van-Hoang Le and Hongyu Zhang. Log parsing with prompt-based few-shot learning. In Proceedings of the 45th International Conference on Software Engineering (ICSE 2023), pages 2438–2449. IEEE, 2023. doi: 10.1109/ICSE48619.2023.00204.
- [78] Van-Hoang Le and Hongyu Zhang. Log parsing: How far can ChatGPT go? In Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE '23), pages 1699–1704. IEEE, 2024. doi: 10.1109/ASE56229.2023.00206.
- [79] Van-Hoang Le and Hongyu Zhang. PreLog: A pre-trained model for log analytics. Proceedings of the ACM on Management of Data, 2(3):1–28, 2024. doi: 10.1145/3654966. Presented at SIGMOD 2024.
- [80] Van-Hoang Le, Yi Xiao, and Hongyu Zhang. Unleashing the true potential of semantic-based log parsing with pre-trained language models. In Proceedings of the 47th International Conference on Software Engineering (ICSE 2025). IEEE/ACM, 2025.