hub

SecGPT: An Execution Isolation Architecture for LLM-Based Systems

Yuhao Wu, Franziska Roesner, Tadayoshi Kohno, Ning Zhang, Umar Iqbal · 2024 · arXiv 2403.04960

15 Pith papers cite this work. Polarity classification is still indexing.

15 Pith papers citing it

read on arXiv browse 15 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 2 dataset 1 method 1

citation-polarity summary

background 2 use dataset 1 use method 1

representative citing papers

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

cs.CR · 2024-06-19 · unverdicted · novelty 8.0

AgentDojo introduces an extensible evaluation framework populated with realistic agent tasks and security test cases to measure prompt injection robustness in tool-using LLM agents.

AgenTEE: Confidential LLM Agent Execution on Edge Devices

cs.CR · 2026-04-20 · unverdicted · novelty 7.0

AgenTEE isolates LLM agent runtime, inference, and apps in independently attested cVMs on Arm-based edge devices, achieving under 5.15% overhead versus commodity OS deployments.

From Standalone LLMs to Integrated Intelligence: A Survey of Compound Al Systems

cs.MA · 2025-06-05 · accept · novelty 7.0

A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.

Aligning Provenance with Authorization: A Dual-Graph Defense for LLM Agents

cs.CR · 2026-05-26 · unverdicted · novelty 6.0

AuthGraph aligns an execution provenance graph with a clean authorization graph to detect parameter-source deviations from user intent, reducing attack success rates to 1-2% on AgentDojo and AgentDyn while retaining most task utility.

PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems

cs.CR · 2026-05-15 · unverdicted · novelty 6.0

PrivScope enforces task-scoped disclosure at the local-cloud boundary in hybrid agents, eliminating profile leakage and halving re-identification risk on medical workflows while preserving task success.

Behavioral Integrity Verification for AI Agent Skills

cs.CR · 2026-05-12 · unverdicted · novelty 6.0

BIV audits AI agent skills at scale, finding 80% deviate from declared behavior on 49,943 skills and achieving 0.946 F1 for malicious skill detection.

ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection

cs.CR · 2026-05-05 · unverdicted · novelty 6.0

ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.

Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

cs.CR · 2026-05-01 · unverdicted · novelty 6.0

Semia synthesizes Datalog representations of agent skills via constraint-guided loops to enable reachability queries for semantic risks, finding critical issues in over half of 13,728 real skills with 97.7% recall on expert-labeled samples.

Parallax: Why AI Agents That Think Must Never Act

cs.CR · 2026-04-14 · unverdicted · novelty 6.0

Parallax enforces structural separation between AI thinking and acting via independent multi-tier validation, information flow control, and state rollback, blocking 98.9% of 280 adversarial attacks with zero false positives even when the reasoning system is fully compromised.

CAEC: Confidential, Attestable, and Efficient Inter-CVM Communication with Arm CCA

cs.CR · 2025-12-01 · unverdicted · novelty 6.0

CAEC adds confidential shared memory to Arm CCA, cutting inter-CVM communication cost by up to 209x versus encryption through hypervisor-visible memory while preserving isolation and adding attestable sharing.

Whispers in the Machine: Confidentiality in Agentic Systems

cs.CR · 2024-02-10 · unverdicted · novelty 6.0

Systematic testing of ten LLM agents across 20 tool scenarios and 14 attacks finds universal vulnerability to prompt injection enabling data exfiltration, with tooling amplifying leakage.

Options, Not Clicks: Lattice Refinement for Consent-Driven MCP Authorization

cs.CR · 2026-05-12 · unverdicted · novelty 5.0

Conleash uses a risk lattice, policy engine, and refinement loop to deliver scoped, consent-driven authorization for MCP tool calls, reaching 98.2% accuracy and 99.4% escalation catch rate on 984 traces with 8.2 ms overhead and higher user preference in a 16-person study.

ClawLess: A Security Model of AI Agents

cs.CR · 2026-04-07 · unverdicted · novelty 5.0

ClawLess introduces a formal fine-grained security model for AI agents with runtime-adaptive policies enforced via user-space kernel and BPF syscall interception.

AgentGuard: An Attribute-Based Access Control Framework for Tool-Use LLM-Based Agent

cs.CR · 2026-05-27 · unverdicted · novelty 4.0

AgentGuard is an ABAC framework for tool-use LLM agents with lightweight client integration and three server-side inspection mechanisms for single-tool and cross-tool risks.

From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap

cs.SE · 2024-10-28 · unverdicted · novelty 4.0

A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.

citing papers explorer

Showing 10 of 10 citing papers after filters.

AgenTEE: Confidential LLM Agent Execution on Edge Devices cs.CR · 2026-04-20 · unverdicted · none · ref 58
AgenTEE isolates LLM agent runtime, inference, and apps in independently attested cVMs on Arm-based edge devices, achieving under 5.15% overhead versus commodity OS deployments.
Aligning Provenance with Authorization: A Dual-Graph Defense for LLM Agents cs.CR · 2026-05-26 · unverdicted · none · ref 23
AuthGraph aligns an execution provenance graph with a clean authorization graph to detect parameter-source deviations from user intent, reducing attack success rates to 1-2% on AgentDojo and AgentDyn while retaining most task utility.
PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems cs.CR · 2026-05-15 · unverdicted · none · ref 27
PrivScope enforces task-scoped disclosure at the local-cloud boundary in hybrid agents, eliminating profile leakage and halving re-identification risk on medical workflows while preserving task success.
Behavioral Integrity Verification for AI Agent Skills cs.CR · 2026-05-12 · unverdicted · none · ref 37
BIV audits AI agent skills at scale, finding 80% deviate from declared behavior on 49,943 skills and achieving 0.946 F1 for malicious skill detection.
ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection cs.CR · 2026-05-05 · unverdicted · none · ref 149
ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.
Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis cs.CR · 2026-05-01 · unverdicted · none · ref 44
Semia synthesizes Datalog representations of agent skills via constraint-guided loops to enable reachability queries for semantic risks, finding critical issues in over half of 13,728 real skills with 97.7% recall on expert-labeled samples.
Parallax: Why AI Agents That Think Must Never Act cs.CR · 2026-04-14 · unverdicted · none · ref 49
Parallax enforces structural separation between AI thinking and acting via independent multi-tier validation, information flow control, and state rollback, blocking 98.9% of 280 adversarial attacks with zero false positives even when the reasoning system is fully compromised.
Options, Not Clicks: Lattice Refinement for Consent-Driven MCP Authorization cs.CR · 2026-05-12 · unverdicted · none · ref 54
Conleash uses a risk lattice, policy engine, and refinement loop to deliver scoped, consent-driven authorization for MCP tool calls, reaching 98.2% accuracy and 99.4% escalation catch rate on 984 traces with 8.2 ms overhead and higher user preference in a 16-person study.
ClawLess: A Security Model of AI Agents cs.CR · 2026-04-07 · unverdicted · none · ref 19
ClawLess introduces a formal fine-grained security model for AI agents with runtime-adaptive policies enforced via user-space kernel and BPF syscall interception.
AgentGuard: An Attribute-Based Access Control Framework for Tool-Use LLM-Based Agent cs.CR · 2026-05-27 · unverdicted · none · ref 22
AgentGuard is an ABAC framework for tool-use LLM agents with lightweight client integration and three server-side inspection mechanisms for single-tool and cross-tool risks.

SecGPT: An Execution Isolation Architecture for LLM-Based Systems

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer