CAI Dataset is presented as the largest described corpus of LLM-driven hacker trajectories, with the claim that operator data concentration in frontier-model providers creates a major security risk best addressed by on-premise specialized LLMs.
Toward cybersecurity-expert small language models
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
Dynamic Cyber Ranges with LLM defender agents reduce attacker success to 0-55% and preserve evaluation headroom as models advance by using comparable capabilities on both sides.
MinervaRL applies reinforcement learning with verifiable rewards from CTI standards to improve LLM structured output performance by 15.8 points over base models across 12 benchmarks.
A hybrid LLM-RL red teaming framework generates adaptive attack campaigns in simulated enterprise networks to evaluate the robustness of AI-enabled SOAR systems.
citing papers explorer
-
Cybersecurity AI (CAI) Dataset
CAI Dataset is presented as the largest described corpus of LLM-driven hacker trajectories, with the claim that operator data concentration in frontier-model providers creates a major security risk best addressed by on-premise specialized LLMs.
-
Dynamic Cyber Ranges
Dynamic Cyber Ranges with LLM defender agents reduce attacker success to 0-55% and preserve evaluation headroom as models advance by using comparable capabilities on both sides.
-
A Red Teaming Framework for Evaluating Robustness of AI-enabled Security Orchestration, Automation, and Response Systems
A hybrid LLM-RL red teaming framework generates adaptive attack campaigns in simulated enterprise networks to evaluate the robustness of AI-enabled SOAR systems.