Presents TRUST-Bench benchmark for hidden-trigger tool compromises in LLM agents and VISTA-Guard framework for trajectory-aware risk scoring of final actions under untrusted feedback.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CR 2years
2026 2representative citing papers
Stage-level tracking of prompt injection reveals that write-node placement and model-specific behaviors determine attack outcomes more than initial exposure in LLM pipelines.
citing papers explorer
-
Trust No Tool: Evaluating and Defending LLM Agents under Untrusted Tool Feedback
Presents TRUST-Bench benchmark for hidden-trigger tool compromises in LLM agents and VISTA-Guard framework for trajectory-aware risk scoring of final actions under untrusted feedback.
-
Kill-Chain Canaries: Stage-Level Tracking of Prompt Injection Across Attack Surfaces and Model Safety Tiers
Stage-level tracking of prompt injection reveals that write-node placement and model-specific behaviors determine attack outcomes more than initial exposure in LLM pipelines.