Solver-Aided Verification of Policy Compliance in Tool-Augmented

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CR (2)

years: 2026 (2)

verdicts: unverdicted (2)

representative citing papers

Owner-Harm: A Missing Threat Model for AI Agent Safety

cs.CR · 2026-04-20 · unverdicted · novelty 6.0

Owner-Harm is a new threat model covering eight categories of agent behavior that harm the deployer. Existing defenses achieve only a 14.8% true positive rate on injection-based owner-harm tasks, versus 100% on generic criminal harm.
