Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.
CI-Bench: Benchmarking contextual integrity of ai assistants on synthetic data
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 7roles
background 1polarities
background 1representative citing papers
PrivScope enforces task-scoped disclosure at the local-cloud boundary in hybrid agents, eliminating profile leakage and halving re-identification risk on medical workflows while preserving task success.
A new benchmark shows enterprise LLM agents violate contextual integrity at rates of 15.8-50.9% with leakage up to 26.7%, and higher task performance correlates with more privacy breaches that model scaling does not fix.
ContextLens improves LLM compliance assessment for GDPR and EU AI Act by grounding imperfect contexts through targeted questions on applicability, principles, and provisions while identifying missing factors, without any training.
LLMs exhibit 20-40% lower recall on ambiguous human names for PII detection, worsening under prompt injections, as shown via the new AmBench benchmark.
SELFCI uses complementary self-distillation with two reverse KL divergences to align LLMs to contextual integrity while preserving utility, outperforming RL baselines like GRPO in agentic settings.
Reinforcement learning is advanced for communication-efficient federated optimization and for preference-aligned, contextually safe policies in large language models.
citing papers explorer
-
Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents
Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.
-
PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems
PrivScope enforces task-scoped disclosure at the local-cloud boundary in hybrid agents, eliminating profile leakage and halving re-identification risk on medical workflows while preserving task success.
-
CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents
A new benchmark shows enterprise LLM agents violate contextual integrity at rates of 15.8-50.9% with leakage up to 26.7%, and higher task performance correlates with more privacy breaches that model scaling does not fix.
-
ContextLens: Modeling Imperfect Privacy and Safety Context for Legal Compliance
ContextLens improves LLM compliance assessment for GDPR and EU AI Act by grounding imperfect contexts through targeted questions on applicability, principles, and provisions while identifying missing factors, without any training.
-
Can Large Language Models Really Recognize Your Name?
LLMs exhibit 20-40% lower recall on ambiguous human names for PII detection, worsening under prompt injections, as shown via the new AmBench benchmark.
-
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs
SELFCI uses complementary self-distillation with two reverse KL divergences to align LLMs to contextual integrity while preserving utility, outperforming RL baselines like GRPO in agentic settings.
-
Reinforcement Learning for Scalable and Trustworthy Intelligent Systems
Reinforcement learning is advanced for communication-efficient federated optimization and for preference-aligned, contextually safe policies in large language models.