Watch out for your agents! Investigating backdoor threats to LLM-based agents. Advances in Neural Information Processing Systems, 37:100938–100964, 2024b.
9 Pith papers cite this work.
citing papers explorer
- Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem
  This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers revealing systemic vulnerabilities from missing isolation and least-privilege in the
- When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models
  CREST-Search is a red-teaming framework that crafts seemingly benign search queries to induce unsafe citations from web-augmented LLMs, backed by a new WebSearch-Harm dataset for fine-tuning a specialized attacker model.
- Efficient Black-Box Fault Localization for System-Level Test Code Using Large Language Models
  A black-box LLM approach for fault localization in system-level test code that estimates execution traces from failure logs to rank potential faults with reduced inference cost.
- Principled Detection of Hallucinations in Large Language Models via Multiple Testing
  The method aggregates multiple hallucination evaluation scores via conformal p-values to enable calibrated detection with controlled false alarm rates across LLMs and datasets.
- MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts
  MolReFlect introduces a teacher-student framework that automatically creates fine-grained molecule-text alignments to achieve SOTA results on molecule-caption translation.
- Like a Hammer, It Can Build, It Can Break: Large Language Model Uses, Perceptions, and Adoption in Cybersecurity Operations on Reddit
  Security practitioners use LLMs independently for low-risk productivity tasks while showing interest in enterprise platforms, but reliability, verification needs, and security risks limit broader autonomy.
- LLM-EDT: Large Language Model Enhanced Cross-domain Sequential Recommendation with Dual-phase Training
  LLM-EDT improves cross-domain sequential recommendation by using LLMs for transferable item augmentation, dual-phase training to handle domain transitions, and domain-aware profiling to build user profiles.
- REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations
- SelfGrader: LLM Jailbreak Detection via Anchored Token-Level Logits