Field: cs.CR. Year: 2026. Citing papers: 1.
FragBench: Cross-Session Attacks Hidden in Benign-Looking Fragments
FragBench demonstrates that cross-session fragmented LLM attacks evade single-turn safety judges but can be detected with F1 scores of 0.88-0.96 by applying graph neural networks to user interaction graphs.
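To make the intuition concrete, here is a minimal sketch of why cross-session aggregation catches what single-turn judging misses. It is an illustrative toy, not FragBench's method: the `InteractionGraph` class, the `RISKY_TOKENS` watchlist, and the single mean-aggregation step (standing in for a trained GNN layer) are all assumptions introduced for this example.

```python
from collections import defaultdict

# Assumed toy watchlist; a real system would use a learned per-message encoder.
RISKY_TOKENS = {"payload", "bypass", "exfiltrate", "disable"}

def message_score(text):
    """Toy single-turn risk signal: fraction of tokens on the watchlist."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(t in RISKY_TOKENS for t in tokens) / len(tokens)

class InteractionGraph:
    """Nodes are messages; edges link messages sent by the same user,
    even when those messages come from different sessions."""
    def __init__(self):
        self.messages = {}                 # node_id -> text
        self.user_of = {}                  # node_id -> user
        self.by_user = defaultdict(list)   # user -> [node_id, ...]

    def add_message(self, node_id, user, text):
        self.messages[node_id] = text
        self.user_of[node_id] = user
        self.by_user[user].append(node_id)

    def aggregate(self, node_id):
        """One mean-aggregation (message-passing) step over a node's
        same-user neighborhood, standing in for a GNN layer."""
        neighbors = self.by_user[self.user_of[node_id]]
        return sum(message_score(self.messages[n]) for n in neighbors) / len(neighbors)

g = InteractionGraph()
# Each fragment looks mostly benign in isolation...
g.add_message(0, "alice", "please summarize how to disable a setting")
g.add_message(1, "alice", "now describe how a payload might bypass checks")
g.add_message(2, "bob", "what is the weather like today")

# ...but pooling across alice's sessions raises the score above the
# single-turn view of message 0 alone.
single = message_score(g.messages[0])
pooled = g.aggregate(0)
```

A real detector would replace `message_score` with learned node embeddings and `aggregate` with several trained GNN layers over a richer edge set (shared IPs, timing, topic overlap), but the structural point is the same: the signal lives in the user-level neighborhood, not in any one turn.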