arxiv: 2602.07303 · v3 · submitted 2026-02-07 · 💻 cs.DB · cs.AI · cs.SE

KRONE: Scalable LLM-Augmented Log Anomaly Detection via Hierarchical Abstraction

Pith reviewed 2026-05-16 06:56 UTC · model grok-4.3

classification 💻 cs.DB · cs.AI · cs.SE
keywords log anomaly detection · hierarchical abstraction · LLM-augmented detection · sequence decomposition · scalable monitoring · modular detection · log hierarchies

The pith

KRONE extracts execution hierarchies from flat logs to enable modular anomaly detection with far less LLM usage.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that logs come from nested executions but lose that structure when stored flat, so current detectors miss real dependencies and learn false ones instead. KRONE fixes this by first building an automatic semantic hierarchy of the application's components, then splitting each log sequence into coherent execution units called KRONE Seqs. Detection then runs as many small, independent tasks rather than one giant sequence problem, routing most cases to a fast local detector and only a few to a deeper nested-aware model that can call an LLM for explanation. The result is higher accuracy and F1 score at a fraction of the previous data and compute cost, because the hierarchy makes both learning and inference more targeted.

Core claim

KRONE automatically derives application-specific semantic hierarchies from flat logs, decomposes sequences into modular KRONE Seqs, and applies a hybrid strategy that combines level-independent local-context detection for quick filtering with nested-aware detection that captures cross-level dependencies, using LLM augmentation only on a small routed subset while reusing cached results and early exits along the hierarchy.

What carries the argument

The KRONE Log Abstraction Model, which extracts semantic hierarchies to recursively break flat log sequences into coherent KRONE Seq execution units for modular detection.
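To make the decomposition idea concrete, here is a toy sketch of splitting a flat sequence into execution units along a hierarchy. It assumes a hand-written mapping from log keys to (entity, action) paths; the names `KEY_TO_PATH` and `decompose` and the example keys are hypothetical, and the paper's actual extraction is automatic and may group events differently.

```python
from itertools import groupby

# Hypothetical mapping from log key to an (entity, action) path.
# In the paper this structure is extracted automatically from log
# templates; here it is hand-written purely for illustration.
KEY_TO_PATH = {
    "k1": ("block", "allocate"),
    "k2": ("block", "receive"),
    "k3": ("block", "receive"),
    "k4": ("packet", "respond"),
    "k5": ("block", "delete"),
}

def decompose(sequence, level):
    """Split a flat sequence of log keys into contiguous units whose
    hierarchy path shares the same prefix of length `level`
    (1 = entity, 2 = entity + action)."""
    units = []
    for prefix, group in groupby(sequence, key=lambda k: KEY_TO_PATH[k][:level]):
        units.append((prefix, list(group)))
    return units

flat = ["k1", "k2", "k3", "k4", "k5"]
entity_units = decompose(flat, level=1)
action_units = decompose(flat, level=2)

for prefix, keys in entity_units:
    print(prefix, keys)
```

Each unit can then be checked on its own, which is the sense in which one long sequence problem becomes many small ones.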

If this is right

  • Sequence-level log anomaly detection can be replaced by a set of smaller KRONE Seq-level tasks that each respect execution boundaries.
  • Most detection work can be handled by lightweight local-context models, with LLMs invoked on only 1.1 to 3.3 percent of the test data.
  • Cached results and early exits along the hierarchy reduce both memory and compute without sacrificing coverage.
  • Interpretability improves because anomalies can be localized to specific levels and execution units in the hierarchy.
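The routing, caching, and early-exit behaviour listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the fast path is a plain set lookup of known-normal units, and `stub_expensive_check` is a hypothetical stand-in for the nested-aware, LLM-backed detector.

```python
def make_detector(known_normal, expensive_check):
    """Build a sequence detector that tries a cheap lookup first and
    falls back to an expensive check, caching its verdicts."""
    cache = {}
    stats = {"expensive_calls": 0}

    def detect(unit):
        key = tuple(unit)
        if key in known_normal:        # fast local-context path
            return False
        if key not in cache:           # reuse a cached verdict if present
            stats["expensive_calls"] += 1
            cache[key] = expensive_check(key)
        return cache[key]

    def detect_sequence(units):
        # Early exit: any() stops at the first anomalous unit.
        return any(detect(u) for u in units)

    return detect_sequence, stats

def stub_expensive_check(key):
    # Hypothetical placeholder for the expensive detector:
    # flags any unit that contains an "error" event.
    return "error" in key

known_normal = {("open", "read", "close"), ("open", "close")}
detect_sequence, stats = make_detector(known_normal, stub_expensive_check)

first = detect_sequence([["open", "read", "close"], ["open", "error"], ["open", "write"]])
second = detect_sequence([["open", "error"]])               # served from the cache
third = detect_sequence([["open", "close"], ["open", "write"]])
print(first, second, third, stats["expensive_calls"])
```

In the first call the third unit is never inspected because the second already flags the sequence, and the second call costs no expensive check at all.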

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same hierarchy-first decomposition could help other domains where flat sequences hide nested structure, such as network packet traces or call-stack logs.
  • If the abstraction model is retrained on streaming logs, the system could adapt to software updates without full re-labeling.
  • The modular routing may generalize to other hybrid AI pipelines that want to minimize expensive model calls.

Load-bearing premise

That accurate application-specific semantic hierarchies can be extracted automatically from flat logs without heavy domain tuning or labeled hierarchy data.

What would settle it

A test log dataset where the extracted hierarchies group unrelated events together or split true execution units, causing overall F1 score to drop below strong flat-sequence baselines.

Figures

Figures reproduced from arXiv: 2602.07303 by Dennis M. Hofmann, Elke A. Rundensteiner, Jianjun Chen, Jinyang Liu, Lei Cao, Lei Ma, Peter M. VanNostrand, Tieying Zhang.

Figure 1: Hierarchical System, Execution, and Log Anomalies. Low-level …

Figure 2: KRONE Log Abstraction Model of Log Data. a) The example log sequence L with log keys and templates. b) The entities, actions, and statuses extracted from the log templates. c) KRONE Tree as the data schema, and KRONE Seqs decomposed from the example log sequence L.

Figure 3: KRONE Framework. a) Hierarchical Bottom-up Execution. b) Modular KRONE Seq Detection: Local-Context Strategy with pattern matching. c) Modular KRONE Seq Detection: Nested-Aware Strategy with LLM. (Accompanying text: the Local-Context strategy scales efficiently but may overlook cross-level semantics, whereas the Nested-Aware strategy captures semantics at the cost of dependent execution.)

Figure 4: Cardinality reduction achieved by KRONE. Y-axis in log scale. Black dashed line: total training size; FP: # frequent patterns. CP: # closed patterns. MP: # maximal patterns. S-seq: # KRONE S-seqs. ↑: size increase vs. total train size, ↓: size decrease vs. total train size.

Figure 5: Re-usability of KRONE S-seqs. Black dashed line: total occurrences of KRONE S-seqs; bar: # unique KRONE S-seqs. ↓: size reduction w.r.t. occurrences.

Table V: Detection results of pattern matching with or without KRONE hierarchy.

  Dataset      Method      Precision  Recall  F-1
  HDFS         Matching    56.06      99.98   71.84
  HDFS         KRONESAE-P  76.65      99.99   86.78
  BGL          Matching    35.80      100.0   52.72
  BGL          KRONESAE-P  87.20      99.99   93.15
  ThunderBird  Matching    12.48      …

Figure 6: Visualization of KRONE Hierarchy.

Table VI: Detection performance using KRONE with different extracted semantic hierarchies.

  Dataset      Hierarchy  P(%)↑   R(%)↑   F-1(%)↑
  HDFS         hLDA       58.42   99.60   73.64
  HDFS         Raptor     54.18   72.89   62.16
  HDFS         KRONE      80.72   99.31   89.06
  BGL          hLDA       62.60   99.98   76.99
  BGL          Raptor     82.73   98.48   89.92
  BGL          KRONE      92.84   99.97   96.27
  ThunderBird  hLDA       23.95   100.00  38.65
  ThunderBird  Raptor     21.07   100.00  34.80
  ThunderBird  KRONE      60.12   100.00  75.…

Figure 8: Ablation Study of KRONE. KRONESAE-PL equals full KRONE.

Figure 10: F-1 vs. varying i. [Plot residue: panels HDFS, BGL, ThunderBird, IaaS; axis labels "# Requests", "% of # seq", "LLM Inference Percentage i".]

Figure 12: More Visualization of KRONE Hierarchy. [Plot residue: methods LogBert, LogAnomaly, LogRobust, DeepLog, OC4SEQ, KRONE; panels HDFS, BGL, ThunderBird, IaaS; F1, Precision, and Recall against Train Percent.]

Figure 13: F-1 vs. training percentage.

Figure 14: Examples of hierarchical anomalies detected by K…
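The Table V rows that survive in the Figure 5 caption can be sanity-checked against the usual definition of F1 as the harmonic mean of precision and recall. The sketch below assumes the paper uses that standard definition; the tolerance of 0.02 is an arbitrary allowance for rounding in the published precision and recall values.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)

# (dataset, method, precision, recall, reported F-1), copied from Table V.
rows = [
    ("HDFS", "Matching", 56.06, 99.98, 71.84),
    ("HDFS", "KRONESAE-P", 76.65, 99.99, 86.78),
    ("BGL", "Matching", 35.80, 100.0, 52.72),
    ("BGL", "KRONESAE-P", 87.20, 99.99, 93.15),
]

# Largest gap between the recomputed and the reported F-1.
max_gap = max(abs(f1(p, r) - reported) for _, _, p, r, reported in rows)

for dataset, method, p, r, reported in rows:
    print(f"{dataset:5s} {method:11s} recomputed={f1(p, r):.2f} reported={reported:.2f}")
```

The recomputed values agree with the reported ones to within rounding, so the extracted numbers appear internally consistent.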
Original abstract

Log anomaly detection is crucial for uncovering system failures and security risks. Although logs originate from nested component executions with clear boundaries, this structure is lost when stored as flat sequences. As a result, state-of-the-art methods often miss true dependencies within executions while learning spurious correlations across unrelated events. We propose KRONE, the first hierarchical anomaly detection framework that automatically derives execution hierarchies from flat logs to enable modular, multi-level anomaly detection. At its core, the KRONE Log Abstraction Model extracts application-specific semantic hierarchies, which are used to recursively decompose log sequences into coherent execution units, referred to as KRONE Seqs. This transforms sequence-level detection into a set of modular KRONE Seq-level detection tasks. For each test KRONE Seq, KRONE adopts a hybrid modular detection strategy that routes between an efficient level-independent Local-Context detector for rapid filtering and a Nested-Aware detector that captures cross-level semantic dependencies, augmented with LLM-based anomaly detection and explanation. KRONE further optimizes detection through cached result reuse and early-exit strategies along the hierarchy. Experiments on three public benchmarks and one industrial dataset from ByteDance Cloud demonstrate that KRONE achieves substantial improvements in accuracy (42.49% to 87.98%), F1 score, data efficiency (117.3x reduction), resource efficiency (43.7x reduction), and interpretability. KRONE improves F1-score by 10.07% (82.76% to 92.83%) over prior methods while reducing LLM usage to only 1.1% to 3.3% of the test data. Code: https://github.com/LeiMa0324/KRONE Demo: https://leima0324.github.io/KRONE_Demo_official/
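The quoted "10.07%" F1 improvement matches the absolute difference between the two F1 values, so it reads as percentage points rather than a relative gain. A quick check of the arithmetic, where the labels for the absolute and relative differences are introduced here and are not the paper's notation:

```latex
\begin{align*}
\Delta_{\text{abs}} &= 92.83 - 82.76 = 10.07 \text{ percentage points},\\
\Delta_{\text{rel}} &= \frac{92.83 - 82.76}{82.76} \approx 12.17\%.
\end{align*}
```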

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes KRONE, a hierarchical log anomaly detection framework that uses a KRONE Log Abstraction Model to automatically extract application-specific semantic hierarchies from flat logs. These hierarchies enable recursive decomposition of log sequences into coherent KRONE Seqs, which are then processed via a hybrid modular strategy combining an efficient Local-Context detector for filtering, a Nested-Aware detector for cross-level dependencies, LLM-based anomaly detection and explanation, plus caching and early-exit optimizations. Experiments on three public benchmarks and one ByteDance industrial dataset report F1-score gains from 82.76% to 92.83%, accuracy improvements from 42.49% to 87.98%, 117.3x data efficiency, 43.7x resource efficiency, and LLM usage reduced to 1.1-3.3% of test data.

Significance. If the automatic hierarchy extraction is reliable, KRONE could meaningfully advance log anomaly detection by recovering execution structure that flat-sequence methods lose, leading to more modular, interpretable, and efficient detection. The work's strengths include public code release supporting reproducibility, concrete empirical gains across accuracy, efficiency, and LLM reduction on both academic and industrial data, and a hybrid detection design that limits expensive LLM calls. These elements make the contribution potentially influential for structured log analysis if the central assumption holds.

major comments (3)
  1. [Abstract and §3.1] Abstract and §3.1 (KRONE Log Abstraction Model): The headline claims of 10.07% F1 improvement and 117.3x data efficiency rest on the assumption that the automatic hierarchy extraction produces accurate application-specific nesting boundaries and semantic levels; however, the manuscript provides no quantitative validation such as precision/recall of extracted hierarchies against ground-truth or ablation studies measuring the impact of extraction errors on downstream detection.
  2. [§4] §4 (Experiments): The reported performance gains are not accompanied by error analysis or ablation isolating the contribution of the hierarchical decomposition (KRONE Seq) versus LLM routing, caching, or early-exit alone; without such controls it is unclear whether the modular KRONE Seq-level detection is the load-bearing factor for the observed advantages.
  3. [§3.2] §3.2 (KRONE Seq decomposition): The transformation of sequence-level detection into modular KRONE Seq tasks is presented as central, yet no metrics or case studies quantify how often extraction produces incorrect parent-child relations or missed execution units, which would directly undermine the Nested-Aware detector's claimed cross-level semantic benefits.
minor comments (2)
  1. [§2] The definition and formal notation for 'KRONE Seq' appear only after the high-level description; an earlier explicit definition would improve readability.
  2. [§4] Table captions and axis labels in the efficiency plots could more explicitly state the baseline methods being compared to avoid ambiguity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We agree that stronger quantitative validation of the hierarchy extraction and component ablations would improve the paper. We have revised the manuscript to address these points directly while preserving the core claims supported by our end-to-end results.

Point-by-point responses
  1. Referee: [Abstract and §3.1] Abstract and §3.1 (KRONE Log Abstraction Model): The headline claims of 10.07% F1 improvement and 117.3x data efficiency rest on the assumption that the automatic hierarchy extraction produces accurate application-specific nesting boundaries and semantic levels; however, the manuscript provides no quantitative validation such as precision/recall of extracted hierarchies against ground-truth or ablation studies measuring the impact of extraction errors on downstream detection.

    Authors: We agree that direct validation of the extracted hierarchies strengthens the central assumption. Ground-truth semantic hierarchies are unavailable in the public benchmarks and industrial dataset. In the revision we add a human evaluation study (§3.1.1) in which two experts independently annotated hierarchies for 200 sampled sequences across all datasets; KRONE extraction is compared against these annotations. We also add an ablation (§4.4) that injects controlled noise into the extracted relations (5–25% flips) and measures downstream F1 degradation, demonstrating robustness. These additions are now included in the revised manuscript.
    Revision: yes

  2. Referee: [§4] §4 (Experiments): The reported performance gains are not accompanied by error analysis or ablation isolating the contribution of the hierarchical decomposition (KRONE Seq) versus LLM routing, caching, or early-exit alone; without such controls it is unclear whether the modular KRONE Seq-level detection is the load-bearing factor for the observed advantages.

    Authors: We concur that component-level ablations and error analysis are necessary to isolate contributions. The revised §4 now contains a full ablation suite that disables the KRONE Seq decomposition, the Nested-Aware detector, the LLM module, caching, and early-exit in turn, while keeping all other elements fixed. We also add a failure-case analysis that categorizes misclassifications and attributes the majority of baseline errors to missed cross-level dependencies. These results confirm the hierarchical decomposition as the primary driver of gains and are reported in the updated experiments section.
    Revision: yes

  3. Referee: [§3.2] §3.2 (KRONE Seq decomposition): The transformation of sequence-level detection into modular KRONE Seq tasks is presented as central, yet no metrics or case studies quantify how often extraction produces incorrect parent-child relations or missed execution units, which would directly undermine the Nested-Aware detector's claimed cross-level semantic benefits.

    Authors: We acknowledge the value of quantifying extraction fidelity. The revised §3.2 now includes manual inspection metrics on 300 randomly sampled sequences (100 per public benchmark) that report the rate of incorrect parent-child relations and missed execution units. We further add case studies (new Figure 4) contrasting correct and erroneous extractions, together with a sensitivity experiment showing how the Nested-Aware detector performs under varying levels of extraction noise. These quantitative and qualitative additions directly address the concern.
    Revision: yes
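The noise-injection ablation mentioned in the first response (flipping 5 to 25% of extracted relations) could look roughly like the sketch below. It assumes the hierarchy is stored as a child-to-parent dictionary and that a "flip" reassigns a child to a different existing parent; `inject_noise`, the seed handling, and the toy hierarchy are all hypothetical and not taken from the paper or the rebuttal.

```python
import random

def inject_noise(child_to_parent, flip_rate, seed=0):
    """Return a copy of the hierarchy in which a fraction `flip_rate`
    of children are reassigned to a different, randomly chosen parent.
    Assumes at least two distinct parents exist."""
    rng = random.Random(seed)
    parents = sorted(set(child_to_parent.values()))
    children = sorted(child_to_parent)
    n_flips = round(flip_rate * len(children))
    noisy = dict(child_to_parent)
    for child in rng.sample(children, n_flips):
        alternatives = [p for p in parents if p != child_to_parent[child]]
        noisy[child] = rng.choice(alternatives)
    return noisy

# Toy hierarchy: 20 actions spread over 4 entities.
hierarchy = {f"action_{i}": f"entity_{i % 4}" for i in range(20)}

noisy = inject_noise(hierarchy, flip_rate=0.25)
changed = sum(noisy[c] != hierarchy[c] for c in hierarchy)
print(f"{changed} of {len(hierarchy)} relations flipped")
```

Re-running detection on `noisy` at several flip rates and plotting F1 against the rate would give the kind of sensitivity curve the response describes.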

Circularity Check

0 steps flagged

No circularity: empirical framework with independent benchmark results

full rationale

The paper presents KRONE as an empirical system/framework for log anomaly detection. It introduces a Log Abstraction Model to derive hierarchies, decomposes sequences into KRONE Seqs, and applies hybrid detectors with LLM augmentation. All performance claims (F1 improvements, efficiency gains) rest on experimental comparisons against baselines on public and industrial datasets. No equations, first-principles derivations, or predictions are described that reduce by construction to fitted parameters, self-citations, or renamed inputs. The hierarchy extraction step is a core modeling choice but is not justified via self-referential definitions or load-bearing self-citations in the abstract or claims. Public code further supports independent verification. This is a standard non-circular empirical contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that logs contain recoverable nested execution structure and on the empirical performance of the abstraction model and hybrid detectors; no explicit free parameters or invented physical entities are named in the abstract.

axioms (1)
  • domain assumption: Logs originate from nested component executions with clear boundaries that are lost when stored as flat sequences.
    Stated directly in the opening of the abstract as the core motivation.
invented entities (1)
  • KRONE Seq (no independent evidence)
    purpose: Coherent execution unit obtained by recursive decomposition along the derived hierarchy.
    New term introduced to represent the modular units used for detection.

pith-pipeline@v0.9.0 · 5653 in / 1286 out tokens · 31585 ms · 2026-05-16T06:56:40.388253+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DP-FlogTinyLLM: Differentially private federated log anomaly detection using Tiny LLMs

    cs.CR · 2026-04 · unverdicted · novelty 4.0

    DP-FLogTinyLLM combines federated learning, differential privacy, and LoRA-tuned tiny LLMs to match centralized log anomaly detection performance on Thunderbird and BGL datasets while preserving privacy.

Reference graph

Works this paper leans on

88 extracted references · 88 canonical work pages · cited by 1 Pith paper · 6 internal anchors

  1. [1]

    Deeplog: Anomaly detection and diagnosis from system logs through deep learning,

    M. Du, F. Li, G. Zheng, and V . Srikumar, “Deeplog: Anomaly detection and diagnosis from system logs through deep learning,” inProceedings of the 2017 ACM SIGSAC conference on computer and communications security, 2017, pp. 1285–1298

  2. [2]

    Multi-scale one-class recurrent neural networks for discrete event sequence anomaly detection,

    Z. Wang, Z. Chen, J. Ni, H. Liu, H. Chen, and J. Tang, “Multi-scale one-class recurrent neural networks for discrete event sequence anomaly detection,” inProceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, 2021, pp. 3726–3734

  3. [3]

    Robust and transferable log-based anomaly detection,

    P. Jia, S. Cai, B. C. Ooi, P. Wang, and Y . Xiong, “Robust and transferable log-based anomaly detection,”Proceedings of the ACM on Management of Data, vol. 1, no. 1, pp. 1–26, 2023

  4. [4]

    Logbert: Log anomaly detection via bert,

    H. Guo, S. Yuan, and X. Wu, “Logbert: Log anomaly detection via bert,” in2021 international joint conference on neural networks (IJCNN). IEEE, 2021, pp. 1–8

  5. [5]

    Loganomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs

    W. Meng, Y . Liu, Y . Zhu, S. Zhang, D. Pei, Y . Liu, Y . Chen, R. Zhang, S. Tao, P. Sunet al., “Loganomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs.” inIJCAI, vol. 19, no. 7, 2019, pp. 4739–4745

  6. [6]

    Robust log-based anomaly detection on unstable log data,

    X. Zhang, Y . Xu, Q. Lin, B. Qiao, H. Zhang, Y . Dang, C. Xie, X. Yang, Q. Cheng, Z. Li, J. Chen, X. He, R. Yao, J.-G. Lou, M. Chintalapati, F. Shen, and D. Zhang, “Robust log-based anomaly detection on unstable log data,” inProceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Softw...

  7. [7]

    Cat: Beyond efficient transformer for content-aware anomaly detection in event sequences,

    S. Zhang, Y . Liu, X. Zhang, W. Cheng, H. Chen, and H. Xiong, “Cat: Beyond efficient transformer for content-aware anomaly detection in event sequences,” inProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, ser. KDD ’22. New York, NY , USA: Association for Computing Machinery, 2022, pp. 4541–4550. [Online]. Available: ht...

  8. [8]

    Pluto: Sample selection for robust anomaly detection on polluted log data,

    L. Ma, L. Cao, P. M. VanNostrand, D. M. Hofmann, Y . Su, and E. A. Rundensteiner, “Pluto: Sample selection for robust anomaly detection on polluted log data,”Proc. ACM Manag. Data, vol. 2, no. 4, Sep

  9. [9]

    Available: https://doi.org/10.1145/3677139

    [Online]. Available: https://doi.org/10.1145/3677139

  10. [10]

    Log-based anomaly detection without log pars- ing,

    V .-H. Le and H. Zhang, “Log-based anomaly detection without log pars- ing,” in2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2021, pp. 492–504

  11. [11]

    Self- attentive classification-based anomaly detection in unstructured logs,

    S. Nedelkoski, J. Bogatinovski, A. Acker, J. Cardoso, and O. Kao, “Self- attentive classification-based anomaly detection in unstructured logs,” in 2020 IEEE International Conference on Data Mining (ICDM), 2020, pp. 1196–1201

  12. [12]

    Logsed: Anomaly diagnosis through mining time-weighted control flow graph in logs,

    T. Jia, L. Yang, P. Chen, Y . Li, F. Meng, and J. Xu, “Logsed: Anomaly diagnosis through mining time-weighted control flow graph in logs,” in2017 IEEE 10th International Conference on Cloud Computing (CLOUD), 2017, pp. 447–455

  13. [13]

    Hitanomaly: Hierarchical transformers for anomaly detection in system log,

    S. Huang, Y . Liu, C. Fung, R. He, Y . Zhao, H. Yang, and Z. Luan, “Hitanomaly: Hierarchical transformers for anomaly detection in system log,”IEEE Transactions on Network and Service Management, vol. 17, no. 4, pp. 2064–2076, 2020

  14. [14]

    Improving log-based anomaly detection by pre-training hierarchical transformers,

    S. Huang, Y . Liu, C. Fung, H. Wang, H. Yang, and Z. Luan, “Improving log-based anomaly detection by pre-training hierarchical transformers,” IEEE Transactions on Computers, vol. 72, no. 9, pp. 2656–2667, 2023

  15. [15]

    Layerlog: Log sequence anomaly detection based on hierarchical semantics,

    C. Zhang, X. Wang, H. Zhang, J. Zhang, H. Zhang, C. Liu, and P. Han, “Layerlog: Log sequence anomaly detection based on hierarchical semantics,”Applied Soft Computing, vol. 132, p. 109860, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1568494622009097

  16. [16]

    Hlogformer: A hierarchical transformer for representing log data,

    Z. Hou, M. Ghashami, M. Kuznetsov, and M. Torkamani, “Hlogformer: A hierarchical transformer for representing log data,” 2024. [Online]. Available: https://arxiv.org/abs/2408.16803

  17. [17]

    Hierarchical transformers are more efficient language models,

    P. Nawrot, S. Tworkowski, M. Tyrolski, L. Kaiser, Y . Wu, C. Szegedy, and H. Michalewski, “Hierarchical transformers are more efficient language models,” 2022. [Online]. Available: https: //arxiv.org/abs/2110.13711

  18. [18]

    Hier- archical transformers for long document classification,

    R. Pappagari, P. ˙Zelasko, J. Villalba, Y . Carmiel, and N. Dehak, “Hierarchical transformers for long document classification,” 2019. [Online]. Available: https://arxiv.org/abs/1910.10781

  19. [19]

    Deeptralog: Trace-log combined microservice anomaly de- tection through graph-based deep learning,

    C. Zhang, X. Peng, C. Sha, K. Zhang, Z. Fu, X. Wu, Q. Lin, and D. Zhang, “Deeptralog: Trace-log combined microservice anomaly de- tection through graph-based deep learning,” in2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), 2022, pp. 623–634

  20. [20]

    Bridging the gap: Llm-powered transfer learning for log anomaly detection in new software systems,

    Y . Sui, X. Wang, T. Cui, T. Xiao, C. He, S. Zhang, Y . Zhang, X. Yang, Y . Sun, and D. Pei, “Bridging the gap: Llm-powered transfer learning for log anomaly detection in new software systems,” in2025 IEEE 41st International Conference on Data Engineering (ICDE), 2025, pp. 4414– 4427

  21. [21]

    Early exploration of using ChatGPT for log-based anomaly detection on parallel file systems logs

    C. Egersdoerfer, D. Zhang, and D. Dai, “Early exploration of using chatgpt for log-based anomaly detection on parallel file systems logs,” inProceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, ser. HPDC ’23. New York, NY , USA: Association for Computing Machinery, 2023, pp. 315–316. [Online]. Available: ...

  22. [22]

    Large language models for anomaly detection in computational workflows: from supervised fine-tuning to in-context learning,

    H. Jin, G. Papadimitriou, K. Raghavan, P. Zuk, P. Balaprakash, C. Wang, A. Mandal, and E. Deelman, “Large language models for anomaly detection in computational workflows: from supervised fine-tuning to in-context learning,” 2024. [Online]. Available: https: //arxiv.org/abs/2407.17545

  23. [23]

    Llm-based event log analysis techniques: A survey.arXiv preprint arXiv:2502.00677, 2025

    S. Akhtar, S. Khan, and S. Parkinson, “Llm-based event log analysis techniques: A survey,” 2025. [Online]. Available: https: //arxiv.org/abs/2502.00677

  24. [24]

    XRAGLog: A resource-efficient and context-aware log-based anomaly detection method using retrieval-augmented generation,

    L. Zhang, T. Jia, M. Jia, Y . Wu, H. Liu, and Y . Li, “XRAGLog: A resource-efficient and context-aware log-based anomaly detection method using retrieval-augmented generation,” inAAAI 2025 Workshop on Preventing and Detecting LLM Misinformation (PDLM), 2025. [Online]. Available: https://openreview.net/forum?id=8gv7CXuXQ3

  25. [25]

    Interpretable online log analysis using large language models with prompt strategies,

    Y . Liu, S. Tao, W. Meng, J. Wang, W. Ma, Y . Zhao, Y . Chen, H. Yang, Y . Jiang, and X. Chen, “Interpretable online log analysis using large language models with prompt strategies,” 2024. [Online]. Available: https://arxiv.org/abs/2308.07610

  26. [26]

    Beyond the limits: A survey of techniques to extend the context length in large language models,

    X. Wang, M. Salmani, P. Omidi, X. Ren, M. Rezagholizadeh, and A. Eshaghi, “Beyond the limits: A survey of techniques to extend the context length in large language models,” 2024. [Online]. Available: https://arxiv.org/abs/2402.02244

  27. [27]

    CoRR , volume =

    J. Liu, D. Zhu, Z. Bai, Y . He, H. Liao, H. Que, Z. Wang, C. Zhang, G. Zhang, J. Zhang, Y . Zhang, Z. Chen, H. Guo, S. Li, Z. Liu, Y . Shan, Y . Song, J. Tian, W. Wu, Z. Zhou, R. Zhu, J. Feng, Y . Gao, S. He, Z. Li, T. Liu, F. Meng, W. Su, Y . Tan, Z. Wang, J. Yang, W. Ye, B. Zheng, W. Zhou, W. Huang, S. Li, and Z. Zhang, “A comprehensive survey on long c...

  28. [28]

    A survey on deep learning for named entity recognition,

    J. Li, A. Sun, J. Han, and C. Li, “A survey on deep learning for named entity recognition,”IEEE Transactions on Knowledge and Data Engineering, vol. 34, no. 1, pp. 50–70, Jan. 2022. [Online]. Available: http://dx.doi.org/10.1109/TKDE.2020.2981314

  29. [29]

    From Local to Global: A Graph RAG Approach to Query-Focused Summarization

    D. Edge, H. Trinh, N. Cheng, J. Bradley, A. Chao, A. Mody, S. Truitt, D. Metropolitansky, R. O. Ness, and J. Larson, “From local to global: A graph rag approach to query-focused summarization,” 2025. [Online]. Available: https://arxiv.org/abs/2404.16130

  30. [30]

    Automated system monitoring and notification with swatch,

    S. E. Hansen and E. T. Atkins, “Automated system monitoring and notification with swatch,” inProceedings of the 7th USENIX Conference on System Administration, ser. LISA ’93. USA: USENIX Association, 1993, pp. 145–152

  31. [31]

    Refereed papers: Real-time log file analysis using the simple event correlator (sec),

    J. P. Rouillard, “Refereed papers: Real-time log file analysis using the simple event correlator (sec),” inProceedings of the 18th USENIX Conference on System Administration, ser. LISA ’04. USA: USENIX Association, 2004, pp. 133–150

  32. [32]

    Experience report: System log analysis for anomaly detection,

    S. He, J. Zhu, P. He, and M. R. Lyu, “Experience report: System log analysis for anomaly detection,” in2016 IEEE 27th international symposium on software reliability engineering (ISSRE). IEEE, 2016, pp. 207–218

  33. [33]

    A comprehensive study of machine learning techniques for log- based anomaly detection,

    S. Ali, C. Boufaied, D. Bianculli, P. Branco, and L. Briand, “A comprehensive study of machine learning techniques for log- based anomaly detection,”Empirical Software Engineering, vol. 30, no. 5, Jun. 2025. [Online]. Available: http://dx.doi.org/10.1007/ s10664-025-10669-3

  34. [34]

    Log-based anomaly detection without log pars- ing,

    V .-H. Le and H. Zhang, “Log-based anomaly detection without log pars- ing,” in2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2021, pp. 492–504

  35. [35]

    LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing,

    Z. Ma, A. R. Chen, D. J. Kim, T.-H. Chen, and S. Wang, “Llmparser: An exploratory study on using large language models for log parsing,” inProceedings of the IEEE/ACM 46th International Conference on Software Engineering, ser. ICSE ’24. ACM, Apr. 2024, pp. 1–13. [Online]. Available: http://dx.doi.org/10.1145/3597503.3639150

  36. [36]

    Librelog: Accurate and efficient unsupervised log parsing using open-source large language models,

    Z. Ma, D. J. Kim, and T.-H. Chen, “Librelog: Accurate and efficient unsupervised log parsing using open-source large language models,”

  37. [37]
  38. [38]

    LogGPT: Log anomaly detection via GPT.arXiv preprint, 2023

    X. Han, S. Yuan, and M. Trabelsi, “Loggpt: Log anomaly detection via gpt,” 2023. [Online]. Available: https://arxiv.org/abs/2309.14482

  39. [39]

    Raglog: Log anomaly detection using retrieval augmented generation, 2023

    J. Pan, S. L. Wong, and Y . Yuan, “Raglog: Log anomaly detection using retrieval augmented generation,” 2023. [Online]. Available: https://arxiv.org/abs/2311.05261

  40. [40]

    Lost in the Middle: How Language Models Use Long Contexts

    N. F. Liu, K. Lin, J. Hewitt, A. Paranjape, M. Bevilacqua, F. Petroni, and P. Liang, “Lost in the middle: How language models use long contexts,” arXiv preprint arXiv:2307.03172, 2023

  41. [41]

    Deep learning for anomaly detection in log data: A survey,

    M. Landauer, S. Onder, F. Skopik, and M. Wurzenberger, “Deep learning for anomaly detection in log data: A survey,” Machine Learning with Applications, vol. 12, p. 100470, 2023

  42. [42]

    Log-based anomaly detection with deep learning: How far are we?

    V.-H. Le and H. Zhang, “Log-based anomaly detection with deep learning: How far are we?” in Proceedings of the 44th International Conference on Software Engineering, 2022, pp. 1356–1367

  43. [44]

    A data clustering algorithm for mining patterns from event logs,

    R. Vaarandi, “A data clustering algorithm for mining patterns from event logs,” in Proceedings of the 3rd IEEE Workshop on IP Operations and Management (IPOM 2003) (IEEE Cat. No.03EX764), 2003, pp. 119–126

  44. [45]

    Clustering event logs using iterative partitioning,

    A. A. Makanju, A. N. Zincir-Heywood, and E. E. Milios, “Clustering event logs using iterative partitioning,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’09. New York, NY, USA: Association for Computing Machinery, 2009, pp. 1255–1264. [Online]. Available: https://doi.org/10.1145/1557019.1557154

  45. [46]

    Execution anomaly detection in distributed systems through unstructured log analysis,

    Q. Fu, J.-G. Lou, Y. Wang, and J. Li, “Execution anomaly detection in distributed systems through unstructured log analysis,” in 2009 Ninth IEEE International Conference on Data Mining, 2009, pp. 149–158

  46. [47]

    Spell: Streaming parsing of system event logs,

    M. Du and F. Li, “Spell: Streaming parsing of system event logs,” in 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016, pp. 859–864

  47. [48]

    Drain: An online log parsing approach with fixed depth tree,

    P. He, J. Zhu, Z. Zheng, and M. R. Lyu, “Drain: An online log parsing approach with fixed depth tree,” in 2017 IEEE International Conference on Web Services (ICWS). IEEE, 2017, pp. 33–40

  48. [49]

    Neural entity linking: A survey of models based on deep learning,

    O. Sevgili, A. Shelmanov, M. Arkhipov, A. Panchenko, and C. Biemann, “Neural entity linking: A survey of models based on deep learning,” Semantic Web, vol. 13, no. 3, pp. 527–570, Apr. 2022. [Online]. Available: http://dx.doi.org/10.3233/SW-222986

  49. [50]

    Gpt-ner: Named entity recognition via large language models,

    S. Wang, X. Sun, X. Li, R. Ouyang, F. Wu, T. Zhang, J. Li, and G. Wang, “Gpt-ner: Named entity recognition via large language models,” 2023. [Online]. Available: https://arxiv.org/abs/2304.10428

  50. [51]

    PromptNER: Prompting for named entity recognition,

    D. Ashok and Z. C. Lipton, “Promptner: Prompting for named entity recognition,” 2023. [Online]. Available: https://arxiv.org/abs/2305.15444

  51. [52]

    llmner: (zero|few)-shot named entity recognition, exploiting the power of large language models,

    F. Villena, L. Miranda, and C. Aracena, “llmner: (zero|few)-shot named entity recognition, exploiting the power of large language models,” 2024. [Online]. Available: https://arxiv.org/abs/2406.04528

  52. [53]

    Language models are few-shot learners,

    T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., “Language models are few-shot learners,” Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020

  53. [54]

    Rethinking the role of demonstrations: What makes in-context learning work?

    S. Min, M. Lewis, L. Zettlemoyer, and H. Hajishirzi, “Rethinking the role of demonstrations: What makes in-context learning work?” in Transactions of the Association for Computational Linguistics, 2022

  54. [55]

    Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

    J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou, “Chain-of-thought prompting elicits reasoning in large language models,” 2023. [Online]. Available: https://arxiv.org/abs/2201.11903

  55. [56]

    ReAct: Synergizing Reasoning and Acting in Language Models

    S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y. Cao, “React: Synergizing reasoning and acting in language models,” 2023. [Online]. Available: https://arxiv.org/abs/2210.03629

  56. [57]

    What supercomputers say: A study of five system logs,

    A. Oliner and J. Stearley, “What supercomputers say: A study of five system logs,” in 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07). IEEE, 2007, pp. 575–584

  57. [58]

    Detecting large-scale system problems by mining console logs,

    W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan, “Detecting large-scale system problems by mining console logs,” in Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, 2009, pp. 117–132

  58. [59]

    Largescale system problem detection by mining console logs,

    W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, “Largescale system problem detection by mining console logs,” Proceedings of SOSP’09, 2009

  59. [60]

    Estimating the support of a high-dimensional distribution,

    B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, “Estimating the support of a high-dimensional distribution,” Neural Computation, vol. 13, no. 7, pp. 1443–1471, 2001

  60. [61]

    Isolation forest,

    F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in 2008 Eighth IEEE International Conference on Data Mining. IEEE, 2008, pp. 413–422

  61. [62]

    The source code of loglizer,

    “The source code of loglizer,” https://github.com/logpai/loglizer, accessed: 2025-10-26

  62. [63]

    The source code of deep-loglizer,

    “The source code of deep-loglizer,” https://github.com/logpai/deep-loglizer, accessed: 2025-10-26

  63. [64]

    The source code of oc4seq,

    “The source code of oc4seq,” https://github.com/KnowledgeDiscovery/OC4Seq, accessed: 2025-10-26

  64. [65]

    The source code of logprompt,

    “The source code of logprompt,” https://github.com/lunyiliu/LogPrompt/tree/main, accessed: 2025-10-26

  65. [66]

    The source code of logbert,

    “The source code of logbert,” https://github.com/HelenGuohx/logbert, accessed: 2025-10-26

  66. [67]

    PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth,

    J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, “PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth,” in Proceedings 17th International Conference on Data Engineering, 2001, pp. 215–224

  67. [68]

    Frequent pattern mining: current status and future directions,

    J. Han, H. Cheng, D. Xin, and X. Yan, “Frequent pattern mining: current status and future directions,” Data Min. Knowl. Discov., vol. 15, no. 1, pp. 55–86, Aug. 2007. [Online]. Available: https://doi.org/10.1007/s10618-006-0059-1

  68. [69]

    Spade: An efficient algorithm for mining frequent sequences,

    M. J. Zaki, “Spade: An efficient algorithm for mining frequent sequences,” Machine Learning, vol. 42, no. 1, pp. 31–60, 2001

  69. [70]

    Hierarchical topic models and the nested chinese restaurant process,

    T. Griffiths, M. Jordan, J. Tenenbaum, and D. Blei, “Hierarchical topic models and the nested chinese restaurant process,” in Advances in Neural Information Processing Systems, S. Thrun, L. Saul, and B. Schölkopf, Eds., vol. 16. MIT Press, 2003. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2003/file/7b41bfa5085806dfa24b8c9de0ce...

  70. [71]

    Raptor: Recursive abstractive processing for tree-organized retrieval,

    P. Sarthi, S. Abdullah, A. Tuli, S. Khanna, A. Goldie, and C. D. Manning, “Raptor: Recursive abstractive processing for tree-organized retrieval,” in International Conference on Learning Representations (ICLR), 2024

  71. [72]

    Self-Consistency Improves Chain of Thought Reasoning in Language Models

    X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, and D. Zhou, “Self-consistency improves chain of thought reasoning in language models,” 2023. [Online]. Available: https://arxiv.org/abs/2203.11171

  72. [73]

    Cost-effective online multi-llm selection with versatile reward models,

    X. Dai, J. Li, X. Liu, A. Yu, and J. C. S. Lui, “Cost-effective online multi-llm selection with versatile reward models,” 2024. [Online]. Available: https://arxiv.org/abs/2405.16587

  73. [74]

    Optllm: Optimal assignment of queries to large language models,

    Y. Liu, H. Zhang, Y. Miao, V.-H. Le, and Z. Li, “Optllm: Optimal assignment of queries to large language models,” 2024. [Online]. Available: https://arxiv.org/abs/2405.15130

  74. [75]

    Llm bandit: Cost-efficient llm generation via preference-conditioned dynamic routing, arXiv preprint arXiv:2502.02743,

    Y. Li, “Llm bandit: Cost-efficient llm generation via preference-conditioned dynamic routing,” 2025. [Online]. Available: https://arxiv.org/abs/2502.02743

  75. [76]

    Unique security and privacy threats of large language model: A comprehensive survey,

    S. Wang, T. Zhu, B. Liu, M. Ding, X. Guo, D. Ye, W. Zhou, and P. S. Yu, “Unique security and privacy threats of large language model: A comprehensive survey,” 2024. [Online]. Available: https://arxiv.org/abs/2406.07973

  76. [77]

    Security and privacy challenges of large language models: A survey,

    B. C. Das, M. H. Amini, and Y. Wu, “Security and privacy challenges of large language models: A survey,” 2024. [Online]. Available: https://arxiv.org/abs/2402.00888

  77. [78]

    Privacy issues in large language models: A survey,

    S. Neel and P. Chang, “Privacy issues in large language models: A survey,” 2024. [Online]. Available: https://arxiv.org/abs/2312.06717

  78. [79]

    Sovereign large language models: Advantages, strategy and regulations,

    M. Bondarenko, S. Lushnei, Y. Paniv, O. Molchanovsky, M. Romanyshyn, Y. Filipchuk, and A. Kiulian, “Sovereign large language models: Advantages, strategy and regulations,” 2025. [Online]. Available: https://arxiv.org/abs/2503.04745

  79. [80]

    An empirical survey on long document summarization: Datasets, models, and metrics,

    H. Y. Koh, J. Ju, M. Liu, and S. Pan, “An empirical survey on long document summarization: Datasets, models, and metrics,” ACM Computing Surveys, vol. 55, no. 8, pp. 1–35, Dec. 2022. [Online]. Available: http://dx.doi.org/10.1145/3545176

  80. [81]

    A divide-and-conquer approach to the summarization of long documents,

    A. Gidiotis and G. Tsoumakas, “A divide-and-conquer approach to the summarization of long documents,” 2020. [Online]. Available: https://arxiv.org/abs/2004.06190

Showing first 80 references.