System Misuse Detection via Informed Behavior Clustering and Modeling

Linara Adilova; Livin Natious; Michael Kamp; Olivier Thonnard; Siming Chen

arxiv: 1907.00874 · v1 · pith:ARPOMAVPnew · submitted 2019-07-01 · 💻 cs.CR · cs.LG

System Misuse Detection via Informed Behavior Clustering and Modeling

Linara Adilova , Livin Natious , Siming Chen , Olivier Thonnard , Michael Kamp This is my paper

Pith reviewed 2026-05-25 11:49 UTC · model grok-4.3

classification 💻 cs.CR cs.LG

keywords misuse detectioninformed machine learningLSTMbehavior modelingvisual clusteringcybersecurityanomaly detectionsystem logs

0 comments

The pith

Expert-identified clusters from a visual interface enable LSTM models to capture normal system behavior for misuse detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method to detect fraudulent interactions with computer systems by first modeling what normal behavior looks like. Security experts use an interactive visual tool to group logged interactions into meaningful clusters that carry domain knowledge. These clusters then train LSTM neural networks in an informed way, producing models that are more precise than those trained on raw ungrouped data. The approach is demonstrated on real logs from an administrative login and security server, where the informed models successfully represent normal activity and can therefore highlight deviations.

Core claim

Informed modeling that incorporates expert-defined clusters of interactions via a visual interface produces LSTM behavior models capable of capturing normal system interactions, which can then be applied to detect abnormal behavior in security logs.

What carries the argument

Interactive visual interface for expert clustering of interaction logs, used to create informed training sets for LSTM neural networks that model normal behavior.

If this is right

The informed clusters allow the LSTM to learn tighter representations of legitimate interaction sequences.
Models built this way can flag sessions that deviate from the learned normal patterns as potential misuse.
The visual interface reduces the manual review burden on experts by turning their domain knowledge into structured training data.
The same workflow can be repeated on new log sources to adapt the detection models to different systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method might extend to other log-heavy domains such as database access monitoring if similar visual clustering interfaces are built.
If the visual clusters prove stable across different expert users, the approach could support semi-automated retraining pipelines when new normal behavior patterns emerge.
Combining the cluster-informed LSTMs with simpler rule-based checks could create hybrid detectors that are easier to audit.

Load-bearing premise

Security experts can reliably spot semantically meaningful groups of interactions in the visual interface, and those groups will yield LSTM models that are more precise than models trained without the clustering step.

What would settle it

A direct comparison on the same login-server log dataset showing that LSTM models trained on the expert clusters achieve no higher precision or recall in distinguishing normal from abnormal sessions than models trained on the unclustered logs.

Figures

Figures reproduced from arXiv: 1907.00874 by Linara Adilova, Livin Natious, Michael Kamp, Olivier Thonnard, Siming Chen.

**Figure 2.** Figure 2: Diagram of the proposed approach. The training phase can be repeated [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Lengths distribution of the sessions. The longest session consists of [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 5.** Figure 5: Clusters are organized in ascending order by size. The accuracy [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 4.** Figure 4: Comparison of the test accuracy of cluster models calculated on the [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 6.** Figure 6: Development of scores predicted by OC-SVMs per action. We compare [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗

**Figure 8.** Figure 8: Normality estimation in terms of likelihood and average loss suffered [PITH_FULL_IMAGE:figures/full_fig_p006_8.png] view at source ↗

**Figure 9.** Figure 9: Normality estimation in terms of likelihood and average loss suffered [PITH_FULL_IMAGE:figures/full_fig_p007_9.png] view at source ↗

**Figure 10.** Figure 10: Clusters are organized in ascending order by size. The loss achieved [PITH_FULL_IMAGE:figures/full_fig_p009_10.png] view at source ↗

**Figure 12.** Figure 12: Normality estimation in terms of average loss suffered at each action [PITH_FULL_IMAGE:figures/full_fig_p009_12.png] view at source ↗

**Figure 11.** Figure 11: Normality estimation in terms of average likelihood of each action [PITH_FULL_IMAGE:figures/full_fig_p009_11.png] view at source ↗

read the original abstract

One of the main tasks of cybersecurity is recognizing malicious interactions with an arbitrary system. Currently, the logging information from each interaction can be collected in almost unrestricted amounts, but identification of attacks requires a lot of effort and time of security experts. We propose an approach for identifying fraud activity through modeling normal behavior in interactions with a system via machine learning methods, in particular LSTM neural networks. In order to enrich the modeling with system specific knowledge, we propose to use an interactive visual interface that allows security experts to identify semantically meaningful clusters of interactions. These clusters incorporate domain knowledge and lead to more precise behavior modeling via informed machine learning. We evaluate the proposed approach on a dataset containing logs of interactions with an administrative interface of login and security server. Our empirical results indicate that the informed modeling is capable of capturing normal behavior, which can then be used to detect abnormal behavior.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper describes an expert-in-the-loop approach to inform LSTM models with visual clusters for log-based misuse detection, but provides no quantitative results or comparisons to support the benefit of the informed modeling.

read the letter

The paper's main contribution is using an interactive visual interface to let experts cluster system interactions, then using those clusters to inform LSTM training for modeling normal behavior and detecting anomalies in cybersecurity logs. It does a solid job describing a workflow that mixes human domain knowledge with standard neural sequence modeling, which addresses a real pain point where pure data-driven approaches can miss context. The evaluation claims success on administrative login logs, showing the models can capture normal patterns. Where it falls short is the lack of any reported metrics, baselines, or ablations. There's no direct comparison showing that the expert-informed clusters produce better models than training without them, which is the key assumption. The paper also doesn't detail how the clusters are incorporated into the LSTM process. This makes it difficult to assess if the approach delivers on its promise or if the results are just from the LSTM itself. It's aimed at practitioners and researchers in applied security ML who deal with log data and want to integrate expert feedback. Someone looking for ideas on visual analytics in this space might find it useful, but the thin evaluation limits its impact. I would send this to peer review because the idea is grounded in a practical problem and the method is clearly described, even if the current evidence is preliminary. A referee could push for the necessary comparisons and numbers.

Referee Report

3 major / 1 minor

Summary. The manuscript presents an approach for system misuse detection by modeling normal user behavior with LSTM neural networks. Domain knowledge is incorporated by allowing security experts to identify semantically meaningful clusters of interactions using an interactive visual interface. These clusters are then used to create informed LSTM models. The method is evaluated on a dataset of logs from the administrative interface of a login and security server, claiming that the informed models can capture normal behavior to detect anomalies.

Significance. If validated, this approach could contribute to the field by bridging visual analytics, expert knowledge, and deep learning for cybersecurity applications. It offers a way to leverage limited expert time more effectively in labeling or clustering log data for anomaly detection. The integration of human-in-the-loop clustering with LSTM modeling is a promising direction, though the current lack of detailed empirical support reduces its immediate impact.

major comments (3)

[Abstract] The abstract asserts positive empirical results on the login/security server dataset, but the manuscript supplies no metrics, baselines, ablation studies, or details on how the clusters are incorporated into the LSTM training.
[Evaluation] No quantitative evaluation is presented to support the claim that informed clustering produces more precise models than standard training; the load-bearing assumption that expert clusters improve precision is not directly tested with comparisons.
[Method] The description of the informed machine learning process lacks specifics on the mechanism by which cluster information modifies the LSTM training procedure, such as whether clusters define separate models, input features, or regularization terms.

minor comments (1)

[Introduction] Some notation for the LSTM architecture or clustering could be clarified for readers unfamiliar with the specific implementation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below. Where the manuscript is missing required details or comparisons, we will revise to strengthen the presentation and empirical support.

read point-by-point responses

Referee: [Abstract] The abstract asserts positive empirical results on the login/security server dataset, but the manuscript supplies no metrics, baselines, ablation studies, or details on how the clusters are incorporated into the LSTM training.

Authors: We agree the abstract is too high-level. The evaluation section contains the supporting experiments, but we will revise the abstract to report concrete metrics (e.g., anomaly detection F1 scores), name the baselines, and briefly state that clusters are used to train separate per-cluster LSTM models. revision: yes
Referee: [Evaluation] No quantitative evaluation is presented to support the claim that informed clustering produces more precise models than standard training; the load-bearing assumption that expert clusters improve precision is not directly tested with comparisons.

Authors: The current manuscript reports only qualitative observations on the login/security server logs. We accept that direct quantitative comparisons (informed vs. uninformed LSTM, with and without expert clusters) are required to substantiate the central claim and will add these results, including ablation tables, in the revised evaluation section. revision: yes
Referee: [Method] The description of the informed machine learning process lacks specifics on the mechanism by which cluster information modifies the LSTM training procedure, such as whether clusters define separate models, input features, or regularization terms.

Authors: We will expand the method section with a precise description: expert-provided clusters are used to partition the training data and train one LSTM per cluster; at inference an interaction is routed to the nearest cluster model. We will include the exact training objective, input representation, and any regularization that incorporates cluster membership. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper proposes an interactive visual interface for experts to identify clusters of system interactions, which are then used as input to train LSTM models for normal behavior. This relies on external expert judgment and standard neural network training rather than any self-referential derivation, fitted parameter renamed as prediction, or self-citation chain. No load-bearing step reduces to its own inputs by construction; the method is self-contained against external benchmarks like login server logs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the domain assumption that expert clusters add useful structure; no free parameters, invented entities, or additional axioms are described in the abstract.

axioms (1)

domain assumption Security experts can identify semantically meaningful clusters of interactions using an interactive visual interface that improve subsequent LSTM modeling
This premise is required for the informed modeling to outperform standard approaches and is invoked in the abstract description of the method.

pith-pipeline@v0.9.0 · 5682 in / 1194 out tokens · 51205 ms · 2026-05-25T11:49:12.235291+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 2 internal anchors

[1]

Outside the closed world: On using machine learning for network intrusion detection,

R. Sommer and V . Paxson, “Outside the closed world: On using machine learning for network intrusion detection,” in 2010 IEEE symposium on security and privacy. IEEE, 2010, pp. 305–316

work page 2010
[2]

Machine learning and deep learning methods for cybersecurity,

Y . Xin, L. Kong, Z. Liu, Y . Chen, Y . Li, H. Zhu, M. Gao, H. Hou, and C. Wang, “Machine learning and deep learning methods for cybersecurity,” IEEE Access, vol. 6, pp. 35 365–35 381, 2018

work page 2018
[3]

Confederation of fcm clustering, ann and svm techniques to implement hybrid nids using corrected kdd cup 99 dataset,

A. Chandrasekhar and K. Raghuveer, “Confederation of fcm clustering, ann and svm techniques to implement hybrid nids using corrected kdd cup 99 dataset,” in 2014 International Conference on Communication and Signal Processing. IEEE, 2014, pp. 672–676

work page 2014
[4]

Design of intelli- gent knn-based alarm ﬁlter using knowledge-based alert veriﬁcation in intrusion detection,

W. Meng, W. Li, and L.-F. Kwok, “Design of intelli- gent knn-based alarm ﬁlter using knowledge-based alert veriﬁcation in intrusion detection,” Security and Commu- nication Networks, vol. 8, no. 18, pp. 3883–3895, 2015

work page 2015
[5]

LSTM-Based System-Call Language Modeling and Robust Ensemble Method for Designing Host-Based Intrusion Detection Systems

G. Kim, H. Yi, J. Lee, Y . Paek, and S. Yoon, “Lstm- based system-call language modeling and robust ensem- ble method for designing host-based intrusion detection systems,” arXiv preprint arXiv:1611.01726 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[6]

Understanding user be- haviour through action sequences: from the usual to the unusual,

P. H. Nguyen, C. Turkay, G. Andrienko, N. Andrienko, O. Thonnard, and J. Zouaoui, “Understanding user be- haviour through action sequences: from the usual to the unusual,” IEEE transactions on visualization and computer graphics, 2018

work page 2018
[7]

Modeling human be- havior to anticipate insider attacks,

F. L. Greitzer and R. E. Hohimer, “Modeling human be- havior to anticipate insider attacks,” Journal of Strategic Security, vol. 4, no. 2, pp. 25–48, 2011

work page 2011
[8]

Iden- tifying suspicious user behavior with neural networks,

M. Ussath, D. Jaeger, F. Cheng, and C. Meinel, “Iden- tifying suspicious user behavior with neural networks,” in 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud). IEEE, 2017, pp. 255–263

work page 2017
[9]

User modelling for ex- clusion and anomaly detection: a behavioural intrusion detection system,

G. Pannell and H. Ashman, “User modelling for ex- clusion and anomaly detection: a behavioural intrusion detection system,” in International Conference on User Modeling, Adaptation, and Personalization . Springer, 2010, pp. 207–218

work page 2010
[10]

Anomaly-based in- trusion detection in software as a service,

G. Nascimento and M. Correia, “Anomaly-based in- trusion detection in software as a service,” in 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks Workshops (DSN-W) . IEEE, 2011, pp. 19–24

work page 2011
[11]

Anomaly detection of web- based attacks,

C. Kruegel and G. Vigna, “Anomaly detection of web- based attacks,” in Proceedings of the 10th ACM confer- ence on Computer and communications security . ACM, 2003, pp. 251–261

work page 2003
[12]

Host-based intrusion detection using dynamic and static behavioral models,

D.-Y . Yeung and Y . Ding, “Host-based intrusion detection using dynamic and static behavioral models,” Pattern recognition, vol. 36, no. 1, pp. 229–243, 2003

work page 2003
[13]

Applica- tion of deep recurrent neural networks for prediction of user behavior in tor networks,

T. Ishitaki, R. Obukata, T. Oda, and L. Barolli, “Applica- tion of deep recurrent neural networks for prediction of user behavior in tor networks,” in 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA). IEEE, 2017, pp. 238– 243

work page 2017
[14]

L. C. Jain and L. R. Medsker, Recurrent Neural Net- works: Design and Applications , 1st ed. Boca Raton, FL, USA: CRC Press, Inc., 1999

work page 1999
[15]

Long short -term memory,

S. Hochreiter and J. Schmidhuber, “Long short- term memory,” Neural Comput. , vol. 9, no. 8, pp. 1735–1780, Nov. 1997. [Online]. Available: http: //dx.doi.org/10.1162/neco.1997.9.8.1735

work page doi:10.1162/neco.1997.9.8.1735 1997
[16]

Behavioral anomaly detection of malware on home routers,

N. An, A. Duff, G. Naik, M. Faloutsos, S. Weber, and S. Mancoridis, “Behavioral anomaly detection of malware on home routers,” in 2017 12th International Conference on Malicious and Unwanted Software (MAL- WARE). IEEE, 2017, pp. 47–54

work page 2017
[17]

Recurrent neural network language models for open vocabulary event-level cyber anomaly detection,

A. R. Tuor, R. Baerwolf, N. Knowles, B. Hutchinson, N. Nichols, and R. Jasper, “Recurrent neural network language models for open vocabulary event-level cyber anomaly detection,” in Workshops at the Thirty-Second AAAI Conference on Artiﬁcial Intelligence , 2018

work page 2018
[18]

A neural probabilistic language model,

Y . Bengio, R. Ducharme, P. Vincent, and C. Janvin, “A neural probabilistic language model,” J. Mach. Learn. Res. , vol. 3, pp. 1137–1155, Mar. 2003. [Online]. Available: http://dl.acm.org/citation.cfm?id= 944919.944966

work page arXiv 2003
[19]

Recurrent neural network based language model

T. Mikolov, M. Karaﬁt, L. Burget, J. Cernock, and S. Khudanpur, “Recurrent neural network based language model.” in INTERSPEECH, T. Kobayashi, K. Hirose, and S. Nakamura, Eds. ISCA, 2010, pp. 1045–1048. [Online]. Available: http://dblp.uni-trier.de/db/conf/ interspeech/interspeech2010.html#MikolovKBCK10

work page 2010
[20]

Lstm neural networks for language modeling,

M. Sundermeyer, R. Schl ¨uter, and H. Ney, “Lstm neural networks for language modeling,” in INTERSPEECH, 2012

work page 2012
[21]

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Y . Wu, M. Schuster, Z. Chen, Q. V . Le, M. Norouzi, W. Macherey, M. Krikun, Y . Cao, Q. Gao, K. Macherey et al. , “Google’s neural machine translation system: Bridging the gap between human and machine transla- tion,” arXiv preprint arXiv:1609.08144 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[22]

An empirical study of smoothing techniques for language modeling,

S. F. Chen and J. Goodman, “An empirical study of smoothing techniques for language modeling,” in Proceedings of the 34th Annual Meeting on Association for Computational Linguistics, ser. ACL ’96. Stroudsburg, PA, USA: Association for Computational Linguistics, 1996, pp. 310–318. [Online]. Available: https://doi.org/10.3115/981863.981904

work page doi:10.3115/981863.981904 1996
[23]

Early-stage mal- ware prediction using recurrent neural networks,

M. Rhode, P. Burnap, and K. Jones, “Early-stage mal- ware prediction using recurrent neural networks,” com- puters & security , vol. 77, pp. 578–594, 2018

work page 2018
[24]

LDA ensembles for interactive exploration and categorization of behaviors,

S. Chen, N. Andrienko, G. Andrienko, L. Adilova, J. Bar- let, J. Kindermann, P. H. Nguyen, O. Thonnard, and C. Turkay, “LDA ensembles for interactive exploration and categorization of behaviors,” IEEE Transactions on Visualization and Computer Graphics , 2019

work page 2019
[25]

Latent dirichlet allocation,

D. M. Blei, A. Y . Ng, and M. I. Jordan, “Latent dirichlet allocation,” Journal of machine Learning research, vol. 3, no. Jan, pp. 993–1022, 2003

work page 2003
[26]

Support vector method for novelty detection,

B. Sch ¨olkopf, R. C. Williamson, A. J. Smola, J. Shawe- Taylor, and J. C. Platt, “Support vector method for novelty detection,” in Advances in neural information processing systems, 2000, pp. 582–588

work page 2000
[27]

Real-time computer network anomaly detection using machine learning techniques,

K. Limthong, “Real-time computer network anomaly detection using machine learning techniques,” Journal of Advances in Computer Networks , vol. 1, no. 1, 2013

work page 2013
[28]

Dropout: a simple way to pre- vent neural networks from overﬁtting,

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to pre- vent neural networks from overﬁtting,” The Journal of Machine Learning Research , vol. 15, no. 1, pp. 1929– 1958, 2014

work page 1929
[29]

Generation of a new ids test dataset: Time to retire the kdd collection,

G. Creech and J. Hu, “Generation of a new ids test dataset: Time to retire the kdd collection,” in 2013 IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 2013, pp. 4487–4492. VI. A PPENDIX Per cluster evaluation in terms of loss values achieved by models is shown in Figure 10. Fig. 10. Clusters are organized in ascending order by size. ...

work page 2013

[1] [1]

Outside the closed world: On using machine learning for network intrusion detection,

R. Sommer and V . Paxson, “Outside the closed world: On using machine learning for network intrusion detection,” in 2010 IEEE symposium on security and privacy. IEEE, 2010, pp. 305–316

work page 2010

[2] [2]

Machine learning and deep learning methods for cybersecurity,

Y . Xin, L. Kong, Z. Liu, Y . Chen, Y . Li, H. Zhu, M. Gao, H. Hou, and C. Wang, “Machine learning and deep learning methods for cybersecurity,” IEEE Access, vol. 6, pp. 35 365–35 381, 2018

work page 2018

[3] [3]

Confederation of fcm clustering, ann and svm techniques to implement hybrid nids using corrected kdd cup 99 dataset,

A. Chandrasekhar and K. Raghuveer, “Confederation of fcm clustering, ann and svm techniques to implement hybrid nids using corrected kdd cup 99 dataset,” in 2014 International Conference on Communication and Signal Processing. IEEE, 2014, pp. 672–676

work page 2014

[4] [4]

Design of intelli- gent knn-based alarm ﬁlter using knowledge-based alert veriﬁcation in intrusion detection,

W. Meng, W. Li, and L.-F. Kwok, “Design of intelli- gent knn-based alarm ﬁlter using knowledge-based alert veriﬁcation in intrusion detection,” Security and Commu- nication Networks, vol. 8, no. 18, pp. 3883–3895, 2015

work page 2015

[5] [5]

LSTM-Based System-Call Language Modeling and Robust Ensemble Method for Designing Host-Based Intrusion Detection Systems

G. Kim, H. Yi, J. Lee, Y . Paek, and S. Yoon, “Lstm- based system-call language modeling and robust ensem- ble method for designing host-based intrusion detection systems,” arXiv preprint arXiv:1611.01726 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[6] [6]

Understanding user be- haviour through action sequences: from the usual to the unusual,

P. H. Nguyen, C. Turkay, G. Andrienko, N. Andrienko, O. Thonnard, and J. Zouaoui, “Understanding user be- haviour through action sequences: from the usual to the unusual,” IEEE transactions on visualization and computer graphics, 2018

work page 2018

[7] [7]

Modeling human be- havior to anticipate insider attacks,

F. L. Greitzer and R. E. Hohimer, “Modeling human be- havior to anticipate insider attacks,” Journal of Strategic Security, vol. 4, no. 2, pp. 25–48, 2011

work page 2011

[8] [8]

Iden- tifying suspicious user behavior with neural networks,

M. Ussath, D. Jaeger, F. Cheng, and C. Meinel, “Iden- tifying suspicious user behavior with neural networks,” in 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud). IEEE, 2017, pp. 255–263

work page 2017

[9] [9]

User modelling for ex- clusion and anomaly detection: a behavioural intrusion detection system,

G. Pannell and H. Ashman, “User modelling for ex- clusion and anomaly detection: a behavioural intrusion detection system,” in International Conference on User Modeling, Adaptation, and Personalization . Springer, 2010, pp. 207–218

work page 2010

[10] [10]

Anomaly-based in- trusion detection in software as a service,

G. Nascimento and M. Correia, “Anomaly-based in- trusion detection in software as a service,” in 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks Workshops (DSN-W) . IEEE, 2011, pp. 19–24

work page 2011

[11] [11]

Anomaly detection of web- based attacks,

C. Kruegel and G. Vigna, “Anomaly detection of web- based attacks,” in Proceedings of the 10th ACM confer- ence on Computer and communications security . ACM, 2003, pp. 251–261

work page 2003

[12] [12]

Host-based intrusion detection using dynamic and static behavioral models,

D.-Y . Yeung and Y . Ding, “Host-based intrusion detection using dynamic and static behavioral models,” Pattern recognition, vol. 36, no. 1, pp. 229–243, 2003

work page 2003

[13] [13]

Applica- tion of deep recurrent neural networks for prediction of user behavior in tor networks,

T. Ishitaki, R. Obukata, T. Oda, and L. Barolli, “Applica- tion of deep recurrent neural networks for prediction of user behavior in tor networks,” in 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA). IEEE, 2017, pp. 238– 243

work page 2017

[14] [14]

L. C. Jain and L. R. Medsker, Recurrent Neural Net- works: Design and Applications , 1st ed. Boca Raton, FL, USA: CRC Press, Inc., 1999

work page 1999

[15] [15]

Long short -term memory,

S. Hochreiter and J. Schmidhuber, “Long short- term memory,” Neural Comput. , vol. 9, no. 8, pp. 1735–1780, Nov. 1997. [Online]. Available: http: //dx.doi.org/10.1162/neco.1997.9.8.1735

work page doi:10.1162/neco.1997.9.8.1735 1997

[16] [16]

Behavioral anomaly detection of malware on home routers,

N. An, A. Duff, G. Naik, M. Faloutsos, S. Weber, and S. Mancoridis, “Behavioral anomaly detection of malware on home routers,” in 2017 12th International Conference on Malicious and Unwanted Software (MAL- WARE). IEEE, 2017, pp. 47–54

work page 2017

[17] [17]

Recurrent neural network language models for open vocabulary event-level cyber anomaly detection,

A. R. Tuor, R. Baerwolf, N. Knowles, B. Hutchinson, N. Nichols, and R. Jasper, “Recurrent neural network language models for open vocabulary event-level cyber anomaly detection,” in Workshops at the Thirty-Second AAAI Conference on Artiﬁcial Intelligence , 2018

work page 2018

[18] [18]

A neural probabilistic language model,

Y . Bengio, R. Ducharme, P. Vincent, and C. Janvin, “A neural probabilistic language model,” J. Mach. Learn. Res. , vol. 3, pp. 1137–1155, Mar. 2003. [Online]. Available: http://dl.acm.org/citation.cfm?id= 944919.944966

work page arXiv 2003

[19] [19]

Recurrent neural network based language model

T. Mikolov, M. Karaﬁt, L. Burget, J. Cernock, and S. Khudanpur, “Recurrent neural network based language model.” in INTERSPEECH, T. Kobayashi, K. Hirose, and S. Nakamura, Eds. ISCA, 2010, pp. 1045–1048. [Online]. Available: http://dblp.uni-trier.de/db/conf/ interspeech/interspeech2010.html#MikolovKBCK10

work page 2010

[20] [20]

Lstm neural networks for language modeling,

M. Sundermeyer, R. Schl ¨uter, and H. Ney, “Lstm neural networks for language modeling,” in INTERSPEECH, 2012

work page 2012

[21] [21]

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Y . Wu, M. Schuster, Z. Chen, Q. V . Le, M. Norouzi, W. Macherey, M. Krikun, Y . Cao, Q. Gao, K. Macherey et al. , “Google’s neural machine translation system: Bridging the gap between human and machine transla- tion,” arXiv preprint arXiv:1609.08144 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[22] [22]

An empirical study of smoothing techniques for language modeling,

S. F. Chen and J. Goodman, “An empirical study of smoothing techniques for language modeling,” in Proceedings of the 34th Annual Meeting on Association for Computational Linguistics, ser. ACL ’96. Stroudsburg, PA, USA: Association for Computational Linguistics, 1996, pp. 310–318. [Online]. Available: https://doi.org/10.3115/981863.981904

work page doi:10.3115/981863.981904 1996

[23] [23]

Early-stage mal- ware prediction using recurrent neural networks,

M. Rhode, P. Burnap, and K. Jones, “Early-stage mal- ware prediction using recurrent neural networks,” com- puters & security , vol. 77, pp. 578–594, 2018

work page 2018

[24] [24]

LDA ensembles for interactive exploration and categorization of behaviors,

S. Chen, N. Andrienko, G. Andrienko, L. Adilova, J. Bar- let, J. Kindermann, P. H. Nguyen, O. Thonnard, and C. Turkay, “LDA ensembles for interactive exploration and categorization of behaviors,” IEEE Transactions on Visualization and Computer Graphics , 2019

work page 2019

[25] [25]

Latent dirichlet allocation,

D. M. Blei, A. Y . Ng, and M. I. Jordan, “Latent dirichlet allocation,” Journal of machine Learning research, vol. 3, no. Jan, pp. 993–1022, 2003

work page 2003

[26] [26]

Support vector method for novelty detection,

B. Sch ¨olkopf, R. C. Williamson, A. J. Smola, J. Shawe- Taylor, and J. C. Platt, “Support vector method for novelty detection,” in Advances in neural information processing systems, 2000, pp. 582–588

work page 2000

[27] [27]

Real-time computer network anomaly detection using machine learning techniques,

K. Limthong, “Real-time computer network anomaly detection using machine learning techniques,” Journal of Advances in Computer Networks , vol. 1, no. 1, 2013

work page 2013

[28] [28]

Dropout: a simple way to pre- vent neural networks from overﬁtting,

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to pre- vent neural networks from overﬁtting,” The Journal of Machine Learning Research , vol. 15, no. 1, pp. 1929– 1958, 2014

work page 1929

[29] [29]

Generation of a new ids test dataset: Time to retire the kdd collection,

G. Creech and J. Hu, “Generation of a new ids test dataset: Time to retire the kdd collection,” in 2013 IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 2013, pp. 4487–4492. VI. A PPENDIX Per cluster evaluation in terms of loss values achieved by models is shown in Figure 10. Fig. 10. Clusters are organized in ascending order by size. ...

work page 2013