A Pvalue-guided Anomaly Detection Approach Combining Multiple Heterogeneous Log Parser Algorithms on IIoT Systems

Lei Yang; Shenwei Huang; Tao Li; Xueshuo Xie; Xuhang Xiao; Zhi Wang

arxiv: 1907.02765 · v1 · pith:ZUJXUCCDnew · submitted 2019-07-05 · 💻 cs.CR

A Pvalue-guided Anomaly Detection Approach Combining Multiple Heterogeneous Log Parser Algorithms on IIoT Systems

Xueshuo Xie , Zhi Wang , Xuhang Xiao , Lei Yang , Shenwei Huang , Tao Li This is my paper

Pith reviewed 2026-05-25 02:17 UTC · model grok-4.3

classification 💻 cs.CR

keywords anomaly detectionlog parsingp-valueIIoTweighted edit distanceblockchainnonconformity score

0 comments

The pith

P-values derived from weighted edit distance scores across multiple log parsers can effectively recognize abnormal events in IIoT logs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes combining several different log parser algorithms through statistical p-values to detect anomalies in Industrial Internet of Things systems while using blockchain to keep logs tamper-proof. Nonconformity scores are computed with weighted edit distance between each log and predefined events, then turned into p-values that measure how well a log fits an event. The method is applied to large sets of real HDFS logs and IIoT logs. A reader would care because IIoT environments face persistent threats and logs have not been reliably used for detection so far. The central claim is that this p-value guidance produces effective anomaly recognition.

Core claim

Abnormal events could be effectively recognized by the pvalue-guided approach that combines multiple heterogeneous log parser algorithms. Weighted edit distance serves as the score function to calculate nonconformity scores between a log and a predefined event. P-values are then derived from those scores to indicate match quality, and the overall method is validated on real-world HDFS logs and IIoT logs.

What carries the argument

The pvalue-guided combination of heterogeneous log parsers, where weighted edit distance produces nonconformity scores that are converted into p-values measuring how well each log matches a known event.

If this is right

Abnormal events in IIoT systems become detectable by statistically combining outputs from several log parsers.
Blockchain can protect log integrity as a prerequisite for the anomaly detection pipeline.
The same p-value approach works on both HDFS logs and IIoT logs without modification.
Nonconformity scores based on weighted edit distance supply the numeric input needed for the p-value calculation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may extend to other log-heavy domains such as cloud service monitoring if the same parser combination logic holds.
Using multiple parsers could lower reliance on any single parser's weaknesses, though this is not tested in the paper.
A direct follow-up test would be to measure detection latency and compare it against single-parser baselines on streaming IIoT data.

Load-bearing premise

P-values from weighted edit distance nonconformity scores across multiple heterogeneous log parsers can reliably distinguish anomalies without dataset-specific tuning or post-hoc adjustments.

What would settle it

Apply the pvalue-guided method to the IIoT log dataset and observe whether it fails to flag a substantial share of documented abnormal events or requires per-dataset tuning to reach usable detection rates.

Figures

Figures reproduced from arXiv: 1907.02765 by Lei Yang, Shenwei Huang, Tao Li, Xueshuo Xie, Xuhang Xiao, Zhi Wang.

**Figure 1.** Figure 1: The core of the pvalue-guided approach includes log parser, non-conformal measure and pvalue-guided anomaly [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗

**Figure 2.** Figure 2: The 13 log parser algorithms used on the HDFS [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: The p values calculate by four log parser algorithms. For the normal log message, each algorithm gives a high p-values of the match template; but for the abnormal log message, all algorithms gives a low p-values of each template in template set. B. The anomaly detection case on HDFS 2k In this part, we mainly test the accuracy of our approach using a public HDFS dataset, that is, whether pvalue can accurat… view at source ↗

read the original abstract

Industrial Internet of Things (IIoT) is becoming an attack target of advanced persistent threat (APT). Currently, IIoT logs have not been effectively used for anomaly detection. In this paper, we use blockchain to prevent logs from being tampered with and propose a pvalue-guided anomaly detection approach. This approach uses statistical pvalues to combine multiple heterogeneous log parser algorithms. The weighted edit distance is selected as a score function to calculate the nonconformity score between a log and a predefined event. The pvalue is calculated based on the non-conformity scores which indicate how well a log matches an event. This approach is tested on a large number of real-world HDFS logs and IIoT logs. The experiment results show that abnormal events could be effectively recognized by our pvalue-guided approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper combines existing log parsers with p-values from weighted edit distance for IIoT anomaly detection but supplies no numbers or calibration checks to support the claims.

read the letter

The main takeaway is that the paper describes a p-value approach to fuse several heterogeneous log parsers, using weighted edit distance as the nonconformity score and adding blockchain for log integrity, then tests the idea on HDFS and real IIoT traces. It claims abnormal events are effectively recognized, but that is the extent of the evidence given in the abstract and description. The combination itself is not a first-principles advance; it applies standard statistical tools and known parsers to a new domain. The blockchain element is a practical addition for tamper resistance in industrial settings. The setup for handling varied log formats through multiple parsers is reasonable on paper. The central weakness is the absence of any quantitative results, baseline comparisons, error rates, or derivation of how p-values are aggregated across parsers. The free parameters in the edit distance are not shown to be independent of the evaluation data, so the nonconformity scores and resulting p-values could be overfit rather than calibrated. The stress-test point about needing uniformity under the null or robustness to parser choice holds up; nothing in the description addresses sensitivity or validity of the combined statistic. This work is aimed at practitioners doing applied log analysis for IIoT security who might want to try mixing parsers. A reader already working on conformal-style methods for logs could extract the experimental setup, but the paper offers little for someone seeking validated or novel results. I would not bring it to a reading group, would not cite it, and would not send it for peer review without the missing numbers and calibration analysis.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a p-value-guided anomaly detection approach for IIoT systems that combines multiple heterogeneous log parser algorithms. It employs blockchain to prevent log tampering and uses weighted edit distance as a nonconformity score to compute p-values indicating how well a log matches a predefined event. The method is tested on real-world HDFS and IIoT logs, with the claim that abnormal events can be effectively recognized.

Significance. If the p-values derived from the nonconformity scores are shown to be well-calibrated and the parser combination is robust without dataset-specific tuning, the work could offer a statistically principled way to aggregate heterogeneous log parsers for anomaly detection in industrial settings, strengthening defenses against APTs while preserving log integrity via blockchain.

major comments (3)

[Abstract] Abstract: the claim that 'abnormal events could be effectively recognized by our pvalue-guided approach' supplies no quantitative results, error bars, baseline comparisons, precision/recall metrics, or tables of performance numbers, making it impossible to verify the central empirical claim against the data.
[Method] p-value calculation (method section): no explicit formula or derivation is given for how nonconformity scores from weighted edit distance are converted to p-values or aggregated across parsers; without this it is unclear whether the combined statistic preserves validity or produces uniform p-values under the null.
[Experiments] Experiments: the weighted edit distance parameters (parser combination weights) are listed as free parameters yet no calibration check, sensitivity analysis to parser choice, or demonstration that they are independent of the HDFS/IIoT evaluation data is provided, directly affecting the reliability of the anomaly flagging claim.

minor comments (2)

[Method] Notation for the nonconformity score and p-value aggregation should be introduced with explicit equations rather than prose descriptions only.
[Abstract] The abstract would be strengthened by a single sentence summarizing the key quantitative outcome (e.g., F1-score or detection rate) even if full tables appear later.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to improve clarity and empirical support.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that 'abnormal events could be effectively recognized by our pvalue-guided approach' supplies no quantitative results, error bars, baseline comparisons, precision/recall metrics, or tables of performance numbers, making it impossible to verify the central empirical claim against the data.

Authors: We agree that the abstract should include quantitative support. In the revision we will add specific performance metrics (precision, recall, F1) from the HDFS and IIoT experiments together with baseline comparisons. revision: yes
Referee: [Method] p-value calculation (method section): no explicit formula or derivation is given for how nonconformity scores from weighted edit distance are converted to p-values or aggregated across parsers; without this it is unclear whether the combined statistic preserves validity or produces uniform p-values under the null.

Authors: We accept this criticism. The current manuscript omits an explicit formula. We will insert a dedicated derivation subsection showing the nonconformity-to-p-value mapping and the aggregation rule across parsers, including discussion of validity under the null. revision: yes
Referee: [Experiments] Experiments: the weighted edit distance parameters (parser combination weights) are listed as free parameters yet no calibration check, sensitivity analysis to parser choice, or demonstration that they are independent of the HDFS/IIoT evaluation data is provided, directly affecting the reliability of the anomaly flagging claim.

Authors: We agree that additional validation is required. The revision will contain a sensitivity analysis over the parser weights, a calibration check, and explicit tests confirming that the chosen weights do not overfit the evaluation datasets. revision: yes

Circularity Check

0 steps flagged

No circularity in derivation chain

full rationale

The provided abstract and description outline a p-value-guided method using weighted edit distance for nonconformity scores across log parsers, but contain no equations, parameter-fitting steps, self-citations, or derivations that reduce the claimed anomaly detection to inputs by construction. No load-bearing self-references or fitted-input predictions are exhibited. The approach is presented as a combination of standard statistical and parsing techniques tested on external logs, making the central claim self-contained against the given text.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Abstract-only review yields minimal visibility into parameters or assumptions; the central claim rests on unstated statistical properties of log data and parser heterogeneity.

free parameters (1)

parser combination weights
Required to fuse heterogeneous parsers but unspecified in abstract

axioms (1)

domain assumption P-values computed from edit-distance nonconformity scores across parsers indicate anomalies
Invoked as the core of the pvalue-guided method

pith-pipeline@v0.9.0 · 5680 in / 1104 out tokens · 28121 ms · 2026-05-25T02:17:20.552542+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 3 internal anchors

[1]

Blockchain platform for industrial internet of things,

A. Bahga and V . K. Madisetti, “Blockchain platform for industrial internet of things,” Journal of Software Engineering and Applications , vol. 9, no. 10, p. 533, 2016

work page 2016
[2]

Blockchain-based platform architecture for industrial iot,

N. Teslya and I. Ryabchikov, “Blockchain-based platform architecture for industrial iot,” in Open Innovations Association (FRUCT), 2017 21st Conference of. IEEE, 2017, pp. 321–329

work page 2017
[3]

Execution anomaly detection in distributed systems through unstructured log analysis,

Q. Fu, J.-G. Lou, Y . Wang, and J. Li, “Execution anomaly detection in distributed systems through unstructured log analysis,” in Data Mining,

work page
[4]

Ninth IEEE International Conference on

ICDM’09. Ninth IEEE International Conference on . IEEE, 2009, pp. 149–158

work page 2009
[5]

A lightweight algorithm for message type extraction in system application logs,

A. Makanju, A. N. Zincir-Heywood, and E. E. Milios, “A lightweight algorithm for message type extraction in system application logs,” IEEE Transactions on Knowledge & Data Engineering , vol. 24, no. 11, pp. 1921–1936, 2012

work page 1921
[6]

Logsig: Generating system events from raw textual logs,

L. Tang, T. Li, and C.-S. Perng, “Logsig: Generating system events from raw textual logs,” in Proceedings of the 20th ACM international conference on Information and knowledge management . ACM, 2011, pp. 785–794

work page 2011
[7]

Drain: An online log parsing approach with ﬁxed depth tree,

P. He, J. Zhu, Z. Zheng, and M. R. Lyu, “Drain: An online log parsing approach with ﬁxed depth tree,” in Web Services (ICWS), 2017 IEEE International Conference on . IEEE, 2017, pp. 33–40

work page 2017
[8]

Logcluster-a data clustering and pattern mining algorithm for event logs,

R. Vaarandi and M. Pihelgas, “Logcluster-a data clustering and pattern mining algorithm for event logs,” in Network and Service Management (CNSM), 2015 11th International Conference on . IEEE, 2015, pp. 1–7

work page 2015
[9]

Fingerprinting the datacenter: automated classiﬁcation of performance crises,

P. Bodik, M. Goldszmidt, A. Fox, D. B. Woodard, and H. Andersen, “Fingerprinting the datacenter: automated classiﬁcation of performance crises,” in Proceedings of the 5th European conference on Computer systems. ACM, 2010, pp. 111–124

work page 2010
[10]

Failure diagnosis using decision trees,

M. Chen, A. X. Zheng, J. Lloyd, M. I. Jordan, and E. Brewer, “Failure diagnosis using decision trees,” in International Conference on Autonomic Computing, 2004. Proceedings. IEEE, 2004, pp. 36–43

work page 2004
[11]

Failure prediction in ibm bluegene/l event logs,

Y . Liang, Y . Zhang, H. Xiong, and R. Sahoo, “Failure prediction in ibm bluegene/l event logs,” in Seventh IEEE International Conference on Data Mining (ICDM 2007) . IEEE, 2007, pp. 583–588

work page 2007
[12]

Isolation forest,

F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in 2008 Eighth IEEE International Conference on Data Mining . IEEE, 2008, pp. 413– 422

work page 2008
[13]

Largescale system problem detection by mining console logs,

W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, “Largescale system problem detection by mining console logs,” Proceedings of SOSP09, 2009

work page 2009
[14]

Mining invariants from console logs for system problem detection,

J. G. Lou, Q. Fu, S. Yang, Y . Xu, and J. Li, “Mining invariants from console logs for system problem detection,” Proc of Usenix Atc , pp. 231–244, 2010

work page 2010
[15]

Log clustering based problem identiﬁcation for online service systems,

Q. Lin, H. Zhang, J. G. Lou, Z. Yu, and X. Chen, “Log clustering based problem identiﬁcation for online service systems,” in IEEE/ACM International Conference on Software Engineering Companion , 2016

work page 2016
[16]

Deeplog: Anomaly detection and diagnosis from system logs through deep learning,

D. Min, F. Li, G. Zheng, and V . Srikumar, “Deeplog: Anomaly detection and diagnosis from system logs through deep learning,” in Acm Sigsac Conference on Computer & Communications Security , 2017

work page 2017
[17]

Anomaly detection using autoencoders in high performance computing systems,

A. Borghesi, A. Bartolini, M. Lombardi, M. Milano, and L. Benini, “Anomaly detection using autoencoders in high performance computing systems,” 2018

work page 2018
[18]

Experience report: system log analysis for anomaly detection,

S. He, J. Zhu, P. He, and M. R. Lyu, “Experience report: system log analysis for anomaly detection,” in Software Reliability Engineering (ISSRE), 2016 IEEE 27th International Symposium on . IEEE, 2016, pp. 207–218

work page 2016
[19]

An end-to-end log management framework for distributed systems,

P. He, “An end-to-end log management framework for distributed systems,” in 2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS). IEEE, 2017, pp. 266–267

work page 2017
[20]

Leveraging existing instrumentation to automatically infer invariant- constrained models,

I. Beschastnikh, Y . Brun, S. Schneider, M. Sloan, and M. D. Ernst, “Leveraging existing instrumentation to automatically infer invariant- constrained models,” in Proceedings of the 19th ACM SIGSOFT sym- posium and the 13th European conference on F oundations of software engineering. ACM, 2011, pp. 267–277

work page 2011
[21]

Assisting developers of big data analytics applications when deploying on hadoop clouds,

W. Shang, Z. M. Jiang, H. Hemmati, B. Adams, A. E. Hassan, and P. Martin, “Assisting developers of big data analytics applications when deploying on hadoop clouds,” in Software Engineering (ICSE), 2013 35th International Conference on . IEEE, 2013, pp. 402–411

work page 2013
[22]

Abstracting log lines to log event types for mining software system logs,

M. Nagappan and M. A. V ouk, “Abstracting log lines to log event types for mining software system logs,” in Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on . IEEE, 2010, pp. 114– 117

work page 2010
[23]

Logmine: fast pattern recognition for log analytics,

H. Hamooni, B. Debnath, J. Xu, H. Zhang, G. Jiang, and A. Mueen, “Logmine: fast pattern recognition for log analytics,” in Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. ACM, 2016, pp. 1573–1582

work page 2016
[24]

A Directed Acyclic Graph Approach to Online Log Parsing

P. He, J. Zhu, P. Xu, Z. Zheng, and M. R. Lyu, “A directed acyclic graph approach to online log parsing,” arXiv preprint arXiv:1806.04356, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[25]

An evaluation study on log parsing and its use in log mining,

P. He, J. Zhu, S. He, J. Li, and M. R. Lyu, “An evaluation study on log parsing and its use in log mining,” in 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) . IEEE, 2016, pp. 654–661

work page 2016
[26]

Tools and Benchmarks for Automated Log Parsing

J. Zhu, S. He, J. Liu, P. He, Q. Xie, Z. Zheng, and M. R. Lyu, “Tools and benchmarks for automated log parsing,” arXiv preprint arXiv:1811.03509, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[27]

A tutorial on conformal prediction,

G. Shafer and V . V ovk, “A tutorial on conformal prediction,” Journal of Machine Learning Research , vol. 9, no. Mar, pp. 371–421, 2008

work page 2008
[28]

Plug-in martingales for testing exchangeability on-line

V . Fedorova, A. Gammerman, I. Nouretdinov, and V . V ovk, “Plug- in martingales for testing exchangeability on-line,” arXiv preprint arXiv:1204.3251, 2012. 7

work page internal anchor Pith review Pith/arXiv arXiv 2012
[29]

Transcend: Detecting concept drift in malware classiﬁcation models,

R. Jordaney, K. Sharad, S. K. Dash, Z. Wang, D. Papini, I. Nouretdinov, and L. Cavallaro, “Transcend: Detecting concept drift in malware classiﬁcation models,” in PROCEEDINGS OF THE 26TH USENIX SECURITY SYMPOSIUM (USENIX SECURITY’17) . USENIX Associ- ation, 2017, pp. 625–642

work page 2017
[30]

An application of blockchain and smart contracts for machine-to-machine communications in cyber-physical production systems,

M. Y . Afanasev, Y . V . Fedosov, A. A. Krylova, and S. A. Shorokhov, “An application of blockchain and smart contracts for machine-to-machine communications in cyber-physical production systems,” in 2018 IEEE Industrial Cyber-Physical Systems (ICPS) . IEEE, 2018, pp. 13–19

work page 2018
[31]

Abstracting execution logs to execution events for enterprise applications (short paper),

Z. M. Jiang, A. E. Hassan, P. Flora, and G. Hamann, “Abstracting execution logs to execution events for enterprise applications (short paper),” in Quality Software, 2008. QSIC’08. The Eighth International Conference on. IEEE, 2008, pp. 181–186

work page 2008
[32]

An automated approach for abstracting execution logs to execution events,

Z. M. Jiang, A. E. Hassan, G. Hamann, and P. Flora, “An automated approach for abstracting execution logs to execution events,” Journal of Software Maintenance and Evolution: Research and Practice , vol. 20, no. 4, pp. 249–267, 2008

work page 2008
[33]

Spell: Streaming parsing of system event logs,

D. Min and F. Li, “Spell: Streaming parsing of system event logs,” in IEEE International Conference on Data Mining , 2017

work page 2017

[1] [1]

Blockchain platform for industrial internet of things,

A. Bahga and V . K. Madisetti, “Blockchain platform for industrial internet of things,” Journal of Software Engineering and Applications , vol. 9, no. 10, p. 533, 2016

work page 2016

[2] [2]

Blockchain-based platform architecture for industrial iot,

N. Teslya and I. Ryabchikov, “Blockchain-based platform architecture for industrial iot,” in Open Innovations Association (FRUCT), 2017 21st Conference of. IEEE, 2017, pp. 321–329

work page 2017

[3] [3]

Execution anomaly detection in distributed systems through unstructured log analysis,

Q. Fu, J.-G. Lou, Y . Wang, and J. Li, “Execution anomaly detection in distributed systems through unstructured log analysis,” in Data Mining,

work page

[4] [4]

Ninth IEEE International Conference on

ICDM’09. Ninth IEEE International Conference on . IEEE, 2009, pp. 149–158

work page 2009

[5] [5]

A lightweight algorithm for message type extraction in system application logs,

A. Makanju, A. N. Zincir-Heywood, and E. E. Milios, “A lightweight algorithm for message type extraction in system application logs,” IEEE Transactions on Knowledge & Data Engineering , vol. 24, no. 11, pp. 1921–1936, 2012

work page 1921

[6] [6]

Logsig: Generating system events from raw textual logs,

L. Tang, T. Li, and C.-S. Perng, “Logsig: Generating system events from raw textual logs,” in Proceedings of the 20th ACM international conference on Information and knowledge management . ACM, 2011, pp. 785–794

work page 2011

[7] [7]

Drain: An online log parsing approach with ﬁxed depth tree,

P. He, J. Zhu, Z. Zheng, and M. R. Lyu, “Drain: An online log parsing approach with ﬁxed depth tree,” in Web Services (ICWS), 2017 IEEE International Conference on . IEEE, 2017, pp. 33–40

work page 2017

[8] [8]

Logcluster-a data clustering and pattern mining algorithm for event logs,

R. Vaarandi and M. Pihelgas, “Logcluster-a data clustering and pattern mining algorithm for event logs,” in Network and Service Management (CNSM), 2015 11th International Conference on . IEEE, 2015, pp. 1–7

work page 2015

[9] [9]

Fingerprinting the datacenter: automated classiﬁcation of performance crises,

P. Bodik, M. Goldszmidt, A. Fox, D. B. Woodard, and H. Andersen, “Fingerprinting the datacenter: automated classiﬁcation of performance crises,” in Proceedings of the 5th European conference on Computer systems. ACM, 2010, pp. 111–124

work page 2010

[10] [10]

Failure diagnosis using decision trees,

M. Chen, A. X. Zheng, J. Lloyd, M. I. Jordan, and E. Brewer, “Failure diagnosis using decision trees,” in International Conference on Autonomic Computing, 2004. Proceedings. IEEE, 2004, pp. 36–43

work page 2004

[11] [11]

Failure prediction in ibm bluegene/l event logs,

Y . Liang, Y . Zhang, H. Xiong, and R. Sahoo, “Failure prediction in ibm bluegene/l event logs,” in Seventh IEEE International Conference on Data Mining (ICDM 2007) . IEEE, 2007, pp. 583–588

work page 2007

[12] [12]

Isolation forest,

F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in 2008 Eighth IEEE International Conference on Data Mining . IEEE, 2008, pp. 413– 422

work page 2008

[13] [13]

Largescale system problem detection by mining console logs,

W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, “Largescale system problem detection by mining console logs,” Proceedings of SOSP09, 2009

work page 2009

[14] [14]

Mining invariants from console logs for system problem detection,

J. G. Lou, Q. Fu, S. Yang, Y . Xu, and J. Li, “Mining invariants from console logs for system problem detection,” Proc of Usenix Atc , pp. 231–244, 2010

work page 2010

[15] [15]

Log clustering based problem identiﬁcation for online service systems,

Q. Lin, H. Zhang, J. G. Lou, Z. Yu, and X. Chen, “Log clustering based problem identiﬁcation for online service systems,” in IEEE/ACM International Conference on Software Engineering Companion , 2016

work page 2016

[16] [16]

Deeplog: Anomaly detection and diagnosis from system logs through deep learning,

D. Min, F. Li, G. Zheng, and V . Srikumar, “Deeplog: Anomaly detection and diagnosis from system logs through deep learning,” in Acm Sigsac Conference on Computer & Communications Security , 2017

work page 2017

[17] [17]

Anomaly detection using autoencoders in high performance computing systems,

A. Borghesi, A. Bartolini, M. Lombardi, M. Milano, and L. Benini, “Anomaly detection using autoencoders in high performance computing systems,” 2018

work page 2018

[18] [18]

Experience report: system log analysis for anomaly detection,

S. He, J. Zhu, P. He, and M. R. Lyu, “Experience report: system log analysis for anomaly detection,” in Software Reliability Engineering (ISSRE), 2016 IEEE 27th International Symposium on . IEEE, 2016, pp. 207–218

work page 2016

[19] [19]

An end-to-end log management framework for distributed systems,

P. He, “An end-to-end log management framework for distributed systems,” in 2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS). IEEE, 2017, pp. 266–267

work page 2017

[20] [20]

Leveraging existing instrumentation to automatically infer invariant- constrained models,

I. Beschastnikh, Y . Brun, S. Schneider, M. Sloan, and M. D. Ernst, “Leveraging existing instrumentation to automatically infer invariant- constrained models,” in Proceedings of the 19th ACM SIGSOFT sym- posium and the 13th European conference on F oundations of software engineering. ACM, 2011, pp. 267–277

work page 2011

[21] [21]

Assisting developers of big data analytics applications when deploying on hadoop clouds,

W. Shang, Z. M. Jiang, H. Hemmati, B. Adams, A. E. Hassan, and P. Martin, “Assisting developers of big data analytics applications when deploying on hadoop clouds,” in Software Engineering (ICSE), 2013 35th International Conference on . IEEE, 2013, pp. 402–411

work page 2013

[22] [22]

Abstracting log lines to log event types for mining software system logs,

M. Nagappan and M. A. V ouk, “Abstracting log lines to log event types for mining software system logs,” in Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on . IEEE, 2010, pp. 114– 117

work page 2010

[23] [23]

Logmine: fast pattern recognition for log analytics,

H. Hamooni, B. Debnath, J. Xu, H. Zhang, G. Jiang, and A. Mueen, “Logmine: fast pattern recognition for log analytics,” in Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. ACM, 2016, pp. 1573–1582

work page 2016

[24] [24]

A Directed Acyclic Graph Approach to Online Log Parsing

P. He, J. Zhu, P. Xu, Z. Zheng, and M. R. Lyu, “A directed acyclic graph approach to online log parsing,” arXiv preprint arXiv:1806.04356, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[25] [25]

An evaluation study on log parsing and its use in log mining,

P. He, J. Zhu, S. He, J. Li, and M. R. Lyu, “An evaluation study on log parsing and its use in log mining,” in 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) . IEEE, 2016, pp. 654–661

work page 2016

[26] [26]

Tools and Benchmarks for Automated Log Parsing

J. Zhu, S. He, J. Liu, P. He, Q. Xie, Z. Zheng, and M. R. Lyu, “Tools and benchmarks for automated log parsing,” arXiv preprint arXiv:1811.03509, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[27] [27]

A tutorial on conformal prediction,

G. Shafer and V . V ovk, “A tutorial on conformal prediction,” Journal of Machine Learning Research , vol. 9, no. Mar, pp. 371–421, 2008

work page 2008

[28] [28]

Plug-in martingales for testing exchangeability on-line

V . Fedorova, A. Gammerman, I. Nouretdinov, and V . V ovk, “Plug- in martingales for testing exchangeability on-line,” arXiv preprint arXiv:1204.3251, 2012. 7

work page internal anchor Pith review Pith/arXiv arXiv 2012

[29] [29]

Transcend: Detecting concept drift in malware classiﬁcation models,

R. Jordaney, K. Sharad, S. K. Dash, Z. Wang, D. Papini, I. Nouretdinov, and L. Cavallaro, “Transcend: Detecting concept drift in malware classiﬁcation models,” in PROCEEDINGS OF THE 26TH USENIX SECURITY SYMPOSIUM (USENIX SECURITY’17) . USENIX Associ- ation, 2017, pp. 625–642

work page 2017

[30] [30]

An application of blockchain and smart contracts for machine-to-machine communications in cyber-physical production systems,

M. Y . Afanasev, Y . V . Fedosov, A. A. Krylova, and S. A. Shorokhov, “An application of blockchain and smart contracts for machine-to-machine communications in cyber-physical production systems,” in 2018 IEEE Industrial Cyber-Physical Systems (ICPS) . IEEE, 2018, pp. 13–19

work page 2018

[31] [31]

Abstracting execution logs to execution events for enterprise applications (short paper),

Z. M. Jiang, A. E. Hassan, P. Flora, and G. Hamann, “Abstracting execution logs to execution events for enterprise applications (short paper),” in Quality Software, 2008. QSIC’08. The Eighth International Conference on. IEEE, 2008, pp. 181–186

work page 2008

[32] [32]

An automated approach for abstracting execution logs to execution events,

Z. M. Jiang, A. E. Hassan, G. Hamann, and P. Flora, “An automated approach for abstracting execution logs to execution events,” Journal of Software Maintenance and Evolution: Research and Practice , vol. 20, no. 4, pp. 249–267, 2008

work page 2008

[33] [33]

Spell: Streaming parsing of system event logs,

D. Min and F. Li, “Spell: Streaming parsing of system event logs,” in IEEE International Conference on Data Mining , 2017

work page 2017