Safety under scaffolding: How evaluation conditions shape measured safety

1 Pith paper cites this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

fields

cs.CR 1

years

2026 1

verdicts

CONDITIONAL 1

roles

background 1

polarities

background 1

representative citing papers

citing papers explorer

Showing 1 of 1 citing paper.

  • LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments · cs.CR · 2026-05-11 · conditional · none · ref 17 · internal anchor

    LITMUS is the first benchmark using semantic-physical dual verification and OS state rollback to measure behavioral jailbreaks in LLM agents, revealing that even strong models execute 40%+ of high-risk operations and exhibit execution hallucination.