2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- NRT-Bench: Benchmarking Multi-Turn Red-Teaming of LLM Operator Agents in Safety-Critical Control Rooms
  NRT-Bench reports that adaptive multi-turn attacks cause critical safety function loss in 8.7% to 12.1% of sessions across four frontier LLM operator models, with nearly disjoint vulnerabilities and strongly model-dependent defense effects.
- LLM-Guided Planning for Multi-hop Reasoning over Multimodal Nuclear Regulatory Documents
  An LLM planning agent with dynamic knowledge-graph (KG) state achieves 81.5% accuracy on 200 multi-hop questions from NuScale FSAR documents, outperforming non-planning RAG baselines by up to 38 percentage points.