NuHF Claw: A Risk Constrained Cognitive Agent Framework for Human Centered Procedure Support in Digital Nuclear Control Rooms

Haitao Wang; Jiejuan Tong; Jingang Liang; Jun Sun; Peng Chen; Xingyu Xiao; Zhe Sui

arxiv: 2604.14160 · v1 · submitted 2026-03-23 · 💻 cs.AI

Pith reviewed 2026-05-15 01:22 UTC · model grok-4.3

classification 💻 cs.AI

keywords cognitive agent framework · nuclear control rooms · human reliability analysis · risk constrained autonomy · cognitive state inference · digital operations · human error probability · situational awareness

The pith

NuHF Claw is a cognitive agent framework that uses real-time risk constraints to support operators in digital nuclear control rooms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces NuHF Claw to address elevated cognitive risks in digitized nuclear power plant control rooms where existing human reliability methods fall short. It proposes a persistent agent that couples cognitive state inference from operator interactions with probabilistic safety assessments to regulate autonomous behaviors. This transforms offline analysis into real-time interventions that anticipate degradation and constrain unsafe suggestions. The approach aims to enable safe use of intelligent agents while maintaining human decision authority. Simulator experiments validate its ability to provide risk-aware guidance.

Core claim

NuHF Claw introduces a risk constrained agent runtime that tightly couples cognitive state inference with probabilistic safety assessment, integrating workload and situational awareness estimation with dynamic human error probability prediction to enable proactive, cognition-aware autonomy in nuclear operations.

What carries the argument

The risk constrained agent runtime, which regulates autonomous system behavior in real time by integrating cognitively grounded workload and situational awareness estimates with dynamic human error probability predictions.
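
The coupling described above can be pictured with a minimal sketch. The paper as reviewed supplies no equations or architecture, so everything here is a hypothetical illustration rather than the authors' method: the `CognitiveState` scales, the toy `predict_hep` multiplier, the `hep_limit` threshold, and the two gate outcomes are all assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CognitiveState:
    # Both scales are hypothetical: 0.0 is the lowest level, 1.0 the highest.
    workload: float
    situational_awareness: float


def predict_hep(state: CognitiveState, base_hep: float = 1e-3) -> float:
    """Toy dynamic HEP: a nominal HEP scaled up by workload and by lost awareness.

    The multiplier form is an illustrative assumption, not a validated HRA model.
    """
    workload_factor = 1.0 + 9.0 * state.workload
    awareness_factor = 1.0 + 9.0 * (1.0 - state.situational_awareness)
    # A probability cannot exceed 1.0, so the scaled value is capped.
    return min(1.0, base_hep * workload_factor * awareness_factor)


def gate_recommendation(state: CognitiveState, hep_limit: float = 1e-2) -> str:
    """Return the (hypothetical) mode the agent is allowed to operate in.

    "advise": the agent may offer procedure recommendations.
    "restrict": the agent is limited to navigational guidance only.
    In both modes the operator keeps the final decision.
    """
    if predict_hep(state) > hep_limit:
        return "restrict"
    return "advise"


if __name__ == "__main__":
    calm = CognitiveState(workload=0.1, situational_awareness=0.9)
    strained = CognitiveState(workload=0.9, situational_awareness=0.3)
    print(gate_recommendation(calm), gate_recommendation(strained))
```

The gate only narrows what the agent may propose; it never acts on the plant, which is one way to read the claim that human decision authority is preserved.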

Load-bearing premise

Cognitive states can be inferred accurately and in real time from interface interactions, allowing dynamic prediction of human error probabilities without introducing new failure modes or requiring extensive operator-specific data.

What would settle it

Demonstration in the simulator that the framework either misses a cognitive degradation event leading to operator error or incorrectly blocks a safe action more often than a non-constrained agent.
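
The two failure modes named above could be scored from simulator logs roughly as follows. This is a sketch under stated assumptions: the `Trial` record and its four fields are hypothetical, and the paper does not describe its logging format or its metrics.

```python
from typing import NamedTuple, Optional, Sequence


class Trial(NamedTuple):
    # Hypothetical per-trial record; field names are assumed for illustration.
    degradation_occurred: bool  # ground truth: the operator's cognition degraded
    degradation_flagged: bool   # the framework anticipated that degradation
    action_was_safe: bool       # ground truth: the proposed action was safe
    action_blocked: bool        # the framework blocked the action


def miss_rate(trials: Sequence[Trial]) -> Optional[float]:
    """Fraction of real degradation events the framework failed to flag."""
    events = [t for t in trials if t.degradation_occurred]
    if not events:
        return None  # undefined without any degradation events
    return sum(not t.degradation_flagged for t in events) / len(events)


def false_block_rate(trials: Sequence[Trial]) -> Optional[float]:
    """Fraction of safe actions that were blocked anyway."""
    safe = [t for t in trials if t.action_was_safe]
    if not safe:
        return None  # undefined without any safe actions
    return sum(t.action_blocked for t in safe) / len(safe)
```

Computing both rates for the constrained agent and for a non-constrained baseline on the same scenarios would give the comparison this section asks for.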

Figures

Figures reproduced from arXiv: 2604.14160 by Haitao Wang, Jiejuan Tong, Jingang Liang, Jun Sun, Peng Chen, Xingyu Xiao, Zhe Sui.

**Figure 2.** Reactor Shutdown Procedure Mapped onto the Interface-Element Knowledge Graph (IE-KG)

**Figure 3.** Students Conducting Experiments on the HTR-PM600 1:1 Full-Scope Simulator

**Figure 4.** Interface-Element Knowledge Graph (IE-KG) Derived from the HTR-PM600 Interface

**Figure 5.** Interface-Element Knowledge Graph (IE-KG) Derived from the HTR-PM600 Interface

**Figure 6.** AutoGraph Mapping of Textual Procedures to Interface Coordinates

**Figure 7.** Complete Executable Graph Representation for Reactor Shutdown

Original abstract

The rapid digitization of nuclear power plant main control rooms has fundamentally reshaped operator interaction patterns, introducing complex soft-control behaviors and elevated cognitive risks that are not adequately addressed by existing human reliability analysis approaches. Although recent advances in large language models and autonomous agents offer new opportunities for intelligent decision support, their deployment in safety critical environments remains constrained by risks of hallucinated reasoning and weakened human authority. This study proposes NuHF Claw, a persistent cognitive-risk agent framework that enables risk governed human centered autonomy for digital nuclear operations. The core methodological innovation lies in the introduction of a risk constrained agent runtime, which tightly couples cognitive state inference with probabilistic safety assessment to regulate autonomous system behavior in real time. By integrating cognitively grounded workload and situational awareness estimation with dynamic human error probability prediction, the framework transforms conventional offline reliability analysis into a proactive intervention mechanism embedded directly within operational workflows. Experimental validation on a high-fidelity digital control room simulator demonstrates that NuHF Claw can anticipate interface induced cognitive degradation, dynamically constrain unsafe autonomous recommendations, and provide risk-aware navigational guidance while preserving human decision authority. The results highlight a fundamental shift from automation-driven operation toward cognition-aware autonomy, offering a principled pathway for the safe integration of intelligent agents into next-generation nuclear control environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes NuHF Claw, a persistent cognitive-risk agent framework for human-centered procedure support in digital nuclear control rooms. Its core innovation is a risk-constrained agent runtime that couples cognitive state inference (workload and situational awareness) with dynamic human error probability prediction to regulate autonomous recommendations in real time. The manuscript claims that experimental validation on a high-fidelity digital control room simulator demonstrates the system's ability to anticipate interface-induced cognitive degradation, constrain unsafe autonomous actions, and deliver risk-aware navigational guidance while preserving human decision authority.

Significance. If the claimed real-time integration of interface-derived cognitive inference and dynamic HEP prediction can be shown to operate accurately without eroding operator authority or introducing new failure modes, the work would represent a meaningful step toward cognition-aware autonomy in safety-critical nuclear environments. The shift from offline HRA to embedded proactive intervention is conceptually valuable; however, the complete absence of quantitative metrics, baselines, latency data, or model details currently prevents any assessment of whether the central claims hold.

major comments (2)

[Abstract] The central performance claim that 'experimental validation on a high-fidelity digital control room simulator demonstrates that NuHF Claw can anticipate interface induced cognitive degradation, dynamically constrain unsafe autonomous recommendations, and provide risk-aware navigational guidance' is unsupported by any quantitative results, error bars, baseline comparisons, latency figures, or description of the cognitive inference method.

The manuscript supplies no equations, model architecture, training regime, or per-operator calibration protocol for the claimed integration of cognitive state inference with dynamic human error probability prediction, leaving the real-time accuracy and safety of the risk-constrained runtime unassessable.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that the current manuscript version does not supply the quantitative results, model details, equations, or calibration information needed to substantiate the abstract claims or allow assessment of the risk-constrained runtime. We will perform a major revision to add these elements.

Referee: [Abstract] The central performance claim that 'experimental validation on a high-fidelity digital control room simulator demonstrates that NuHF Claw can anticipate interface induced cognitive degradation, dynamically constrain unsafe autonomous recommendations, and provide risk-aware navigational guidance' is unsupported by any quantitative results, error bars, baseline comparisons, latency figures, or description of the cognitive inference method.

Authors: We acknowledge that the abstract states strong performance claims without accompanying quantitative support in the submitted version. In the revised manuscript we will update the abstract to include key quantitative findings from the simulator experiments (prediction accuracy for cognitive degradation, percentage reduction in unsafe autonomous actions, average recommendation latency, and baseline comparisons against non-risk-constrained agents) together with error bars. We will also insert a concise description of the cognitive inference method (workload and situational awareness estimation) directly into the abstract.

revision: yes

Referee: The manuscript supplies no equations, model architecture, training regime, or per-operator calibration protocol for the claimed integration of cognitive state inference with dynamic human error probability prediction, leaving the real-time accuracy and safety of the risk-constrained runtime unassessable.

Authors: We agree that the submitted manuscript omits the requested technical details. The revised version will contain a new dedicated methods subsection that presents: (1) the mathematical equations for cognitive state inference and its coupling to dynamic HEP prediction, (2) the agent architecture diagram and runtime flow, (3) the training regime and data sources from the high-fidelity simulator, and (4) the per-operator calibration protocol. These additions will enable evaluation of real-time accuracy and safety properties.

revision: yes

Circularity Check

0 steps flagged

No circularity: framework description contains no equations or derivations

The manuscript introduces NuHF Claw as a high-level agent framework that couples cognitive state inference with dynamic human error probability prediction inside a risk-constrained runtime. No equations, model architectures, fitting procedures, or derivation steps are supplied in the abstract or described text. Consequently, no claimed prediction can be shown to reduce by construction to a fitted parameter, self-citation, or renamed input. The experimental validation claim is presented as an empirical outcome rather than a mathematical consequence of prior definitions, leaving the derivation chain empty and the circularity score at zero.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

Only the abstract is available, so the ledger records the main unstated premises required for the framework to function as described.

axioms (2)

domain assumption: Cognitive workload and situational awareness can be inferred in real time from operator interface interactions with sufficient accuracy for safety decisions.
Invoked when the framework claims to anticipate cognitive degradation and regulate autonomy.

domain assumption: Probabilistic safety assessment models can be updated dynamically from live cognitive state estimates without introducing unacceptable latency or error.
Required for the 'risk constrained agent runtime' to operate inside operational workflows.

invented entities (1)

risk constrained agent runtime (no independent evidence)
purpose: Tightly couples cognitive inference with safety assessment to regulate autonomous behavior in real time.
New named component introduced to enforce human-centered constraints; no independent falsifiable evidence supplied beyond the claim of simulator success.

pith-pipeline@v0.9.0 · 5539 in / 1508 out tokens · 68082 ms · 2026-05-15T01:22:59.623523+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: risk-constrained agent runtime... Cognitive Twin Agent (ACT-R)... Dynamic Risk Agent (KRAIL)... Governance Safety Gate (Bayesian Network)

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: dynamic human error probability prediction... IDHEAS-ECA PIF analysis

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

