CI-Bench: Benchmarking contextual integrity of AI assistants on synthetic data

Zhao Cheng, Diane Wan, Matthew Abueg, Sahra Ghalebikesabi, Ren Yi, Eugene Bagdasarian, Borja Balle, Stefan Mellem, Shawn O’Banion · 2024 · arXiv 2409.13903

7 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents

cs.AI · 2026-05-18 · unverdicted · novelty 6.0

Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.

PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems

cs.CR · 2026-05-15 · unverdicted · novelty 6.0

PrivScope enforces task-scoped disclosure at the local-cloud boundary in hybrid agents, eliminating profile leakage and halving re-identification risk on medical workflows while preserving task success.

CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents

cs.CR · 2026-04-23 · unverdicted · novelty 6.0

A new benchmark shows enterprise LLM agents violate contextual integrity at rates of 15.8-50.9% with leakage up to 26.7%, and higher task performance correlates with more privacy breaches that model scaling does not fix.

ContextLens: Modeling Imperfect Privacy and Safety Context for Legal Compliance

cs.CL · 2026-04-14 · unverdicted · novelty 6.0

ContextLens improves LLM compliance assessment for GDPR and EU AI Act by grounding imperfect contexts through targeted questions on applicability, principles, and provisions while identifying missing factors, without any training.

Can Large Language Models Really Recognize Your Name?

cs.CR · 2025-05-20 · unverdicted · novelty 6.0

LLMs exhibit 20-40% lower recall on ambiguous human names for PII detection, worsening under prompt injections, as shown via the new AmBench benchmark.

It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs

cs.LG · 2026-05-18 · unverdicted · novelty 4.0

SELFCI uses complementary self-distillation with two reverse KL divergences to align LLMs to contextual integrity while preserving utility, outperforming RL baselines like GRPO in agentic settings.

Reinforcement Learning for Scalable and Trustworthy Intelligent Systems

cs.LG · 2026-05-08 · unverdicted · novelty 3.0

Reinforcement learning is advanced for communication-efficient federated optimization and for preference-aligned, contextually safe policies in large language models.

citing papers explorer

Showing 7 of 7 citing papers.

Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents
cs.AI · 2026-05-18 · unverdicted · none · ref 22
PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems
cs.CR · 2026-05-15 · unverdicted · none · ref 38
CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents
cs.CR · 2026-04-23 · unverdicted · none · ref 1
ContextLens: Modeling Imperfect Privacy and Safety Context for Legal Compliance
cs.CL · 2026-04-14 · unverdicted · none · ref 3
Can Large Language Models Really Recognize Your Name?
cs.CR · 2025-05-20 · unverdicted · none · ref 9
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs
cs.LG · 2026-05-18 · unverdicted · none · ref 7
Reinforcement Learning for Scalable and Trustworthy Intelligent Systems
cs.LG · 2026-05-08 · unverdicted · none · ref 130
