Can SOC Operators Explain their Decisions while Triaging Alarms? A Real-World Study
Pith reviewed 2026-05-09 21:00 UTC · model grok-4.3
The pith
SOC analysts correctly triage alarms 83 percent of the time but give accurate justifications only 39 percent of the time
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SOC employees correctly determine whether alarms indicate true security problems in 83 percent of cases but provide explanations that reflect the actual root cause in only 39 percent of cases, based on a real-world field study with twelve analysts triaging alarms raised in their own SOC.
What carries the argument
The field study that presents real SOC alarms to analysts, records their binary decisions and free-text justifications, and compares those justifications against independently determined root causes.
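The comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the record layout, the field names, and the choice to divide by all responses (rather than only by responses that included an explanation) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TriageResponse:
    """One analyst's answer for one alarm (hypothetical record layout)."""
    analyst_id: str
    alarm_id: str
    said_true_positive: bool     # analyst's binary triage decision
    is_true_positive: bool       # ground-truth label for the alarm
    justification_matches: bool  # coder's verdict: explanation reflects the root cause


def triage_rates(responses: Sequence[TriageResponse]) -> Tuple[float, float]:
    """Return (decision accuracy, justification accuracy) over all responses."""
    if not responses:
        raise ValueError("at least one response is required")
    n = len(responses)
    # A decision is correct when it agrees with the ground-truth label.
    decisions_ok = sum(r.said_true_positive == r.is_true_positive for r in responses)
    # A justification is correct when the coder judged it to match the root cause.
    justifications_ok = sum(r.justification_matches for r in responses)
    return decisions_ok / n, justifications_ok / n
```

Keeping the two rates separate is the point of the design: a response can count toward decision accuracy while failing the justification check, which is exactly the gap the study reports.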
If this is right
- Decision-support systems are needed that help SOC analysts both make the right triage call and articulate why it is correct.
- Training for SOC staff should address explanation skills in addition to detection accuracy.
- Automated security tools should incorporate features that prompt or generate justifications alongside detections.
- Future research should examine the reasons behind the observed disconnect between correct decisions and correct explanations.
Where Pith is reading between the lines
- The low rate of accurate explanations may arise because analysts often rely on tacit pattern recognition that is difficult to verbalize.
- Similar gaps between decision accuracy and explanation quality could exist in other high-stakes monitoring settings such as network operations or fraud detection.
- Structured prompts or templates for explanations could be tested to see whether they raise the rate of correct justifications without changing decision accuracy.
Load-bearing premise
The researchers can independently and objectively determine the actual root cause of each alarm to judge whether analysts' explanations are accurate.
What would settle it
A larger study with different SOC analysts and alarms finding that correct justifications occur in more than half the cases would challenge the reported gap.
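Whether a replication "finds more than half" depends on sample size as well as on the point estimate. The sketch below computes a Wilson score interval for a proportion, so that a reader can check whether 50 percent falls inside the plausible range. The number of scored explanations is not stated in this summary, so the counts in the usage lines are purely hypothetical and are not the paper's data.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    center = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical counts, chosen only to show how sample size changes the reading.
low_small, high_small = wilson_interval(8, 20)    # 40% of 20 explanations
low_large, high_large = wilson_interval(39, 100)  # 39% of 100 explanations
print(f"n=20:  [{low_small:.3f}, {high_small:.3f}]")
print(f"n=100: [{low_large:.3f}, {high_large:.3f}]")
```

With the small hypothetical sample the interval still contains 0.5, while with the larger one it does not, which is why a bigger study would be more decisive in either direction.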
Original abstract
Security Operations Centers (SOCs) are pivotal in modern enterprises. Tasked to monitor complex network environments constantly under attack, SOCs can be active 24/7 and can include hundreds of operators supported by state-of-the-art technologies. Abundant research has studied the internal processes of SOCs, highlighting their pros and cons, as well as the challenges faced by SOC analysts -- such as dealing with the overwhelming number of false alarms triggered by automated security mechanisms. In this context, we wonder: given that "someone" must triage the alarms, and that such triaging must be grounded on established knowledge or evidence-based reasoning, can SOC employees justify why a certain decision was taken while triaging alarms? Answering such a research question (RQ) can better guide future efforts. We hence tackle this RQ. First, via a systematic literature review across 257 research documents, we provide evidence that this RQ has received limited attention so far. Then, we partner with a real-world SOC and carry out a field study (n=12) with SOC employees. We show them real alarms raised in their SOC, and inquire whether such alarms are indicative of true security problems or not. Then, we ask them to explain their decision. We found that while most analysts were able to separate "true from false" alarms (the decision was correct in 83% of the cases), a correct justification was hardly provided (only 39% of the provided explanations reflected the actual root cause). Ultimately, our results highlight the need for decision-support systems that help SOC analysts not only make the right call -- but also understand and articulate why it is right.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper addresses whether SOC operators can explain their triaging decisions for alarms. It conducts a systematic literature review of 257 documents indicating limited prior research on this question, followed by a real-world field study involving 12 SOC employees who were presented with actual alarms from their SOC. Participants were asked to determine if alarms indicated true security problems and to explain their reasoning. The study finds that decisions were correct in 83% of cases, but only 39% of explanations accurately reflected the actual root cause. The authors conclude that decision-support systems are needed to help analysts both decide correctly and articulate their reasoning.
Significance. If the empirical findings hold after methodological clarification, this work offers practical evidence from a real SOC partnership on the disconnect between correct alarm decisions and the ability to justify them. This has direct implications for analyst training, XAI tool development in security, and SOC process design. The field-study approach using live alarms is a strength that grounds the claims in operational reality, distinguishing it from purely synthetic or survey-based studies.
major comments (1)
- [Field study (methodology and results sections)] The central claim that correct justifications were provided in only 39% of cases rests on the researchers' independent determination of each alarm's 'actual root cause' as the scoring criterion. The manuscript provides no description of how this ground truth was established (e.g., via post-incident forensics, additional logs unavailable to analysts, multiple expert raters, or documented verification protocol). Without such details or inter-rater reliability measures, the 39% figure risks conflating analyst explanatory deficits with possible researcher-analyst interpretive differences. This directly undermines the headline contrast with the 83% correct-decision rate and requires explicit validation in a revision.
minor comments (2)
- [Abstract] The abstract and study description report percentages from n=12 without stating alarm selection criteria, how 'correct justification' was operationalized or scored, or any statistical tests applied to the results.
- [Literature review section] The literature review is summarized only at a high level (257 documents, limited attention); a brief table or paragraph outlining search terms, inclusion criteria, and main themes would improve transparency.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and for acknowledging the value of our real-world field study. We address the single major comment point by point below.
Referee: [Field study (methodology and results sections)] The central claim that correct justifications were provided in only 39% of cases rests on the researchers' independent determination of each alarm's 'actual root cause' as the scoring criterion. The manuscript provides no description of how this ground truth was established (e.g., via post-incident forensics, additional logs unavailable to analysts, multiple expert raters, or documented verification protocol). Without such details or inter-rater reliability measures, the 39% figure risks conflating analyst explanatory deficits with possible researcher-analyst interpretive differences. This directly undermines the headline contrast with the 83% correct-decision rate and requires explicit validation in a revision.
Authors: We agree that the current manuscript does not provide adequate detail on how the actual root cause was determined for each alarm, and that this omission weakens the interpretability of the 39% figure. This is a valid methodological concern. In the revised manuscript we will add an explicit subsection in the methodology describing the ground-truth protocol: root causes were established from the SOC's internal incident-resolution records and supplementary logs that were unavailable to the participating analysts at triage time. Two researchers independently coded each explanation against these records; we will report the resulting inter-rater reliability (Cohen's kappa) and the resolution procedure for disagreements. These additions will allow readers to assess whether the observed gap reflects analyst limitations rather than coding differences.
Revision: yes
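For readers unfamiliar with the reliability measure named in the rebuttal, the following is a minimal sketch of Cohen's kappa for two coders. The label values and the example codings are hypothetical; the rebuttal does not report the actual codes or the resulting kappa.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Hypothetical codings of four explanations by two researchers.
coder_1 = ["correct", "correct", "incorrect", "incorrect"]
coder_2 = ["correct", "incorrect", "incorrect", "incorrect"]
print(cohen_kappa(coder_1, coder_2))
```

A kappa near 1 would suggest that the 39% figure does not hinge on one coder's reading of the explanations, whereas a low value would support the referee's concern about interpretive differences.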
Circularity Check
No circularity: purely empirical observations from field study
The paper reports results from a literature review (257 documents) and a field study (n=12 analysts triaging real alarms), yielding direct percentages (83% correct decisions, 39% correct justifications) based on observed responses versus assessed root causes. No equations, fitted parameters, self-referential definitions, or load-bearing self-citations exist that reduce any claim to its own inputs by construction. The central findings derive from external data collection and comparison, not internal construction or renaming of prior results. This is a standard empirical study with no derivation chain exhibiting the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: researchers can determine the objective root cause of each alarm independently of the analysts' explanations.