LLM lies: Hallucinations are not bugs, but features as adversarial examples

Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Mu-Nan Ning, Li Yuan · 2023 · arXiv 2310.01469

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

read on arXiv browse 9 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem

cs.CR · 2025-09-08 · unverdicted · novelty 8.0

This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers revealing systemic vulnerabilities from missing isolation and least-privilege in the

When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models

cs.CR · 2025-10-09 · unverdicted · novelty 7.0

CREST-Search is a red-teaming framework that crafts seemingly benign search queries to induce unsafe citations from web-augmented LLMs, backed by a new WebSearch-Harm dataset for fine-tuning a specialized attacker model.

Efficient Black-Box Fault Localization for System-Level Test Code Using Large Language Models

cs.SE · 2025-06-23 · unverdicted · novelty 7.0

A black-box LLM approach for fault localization in system-level test code that estimates execution traces from failure logs to rank potential faults with reduced inference cost.

Principled Detection of Hallucinations in Large Language Models via Multiple Testing

cs.CL · 2025-08-25 · unverdicted · novelty 6.0

The method aggregates multiple hallucination evaluation scores via conformal p-values to enable calibrated detection with controlled false alarm rates across LLMs and datasets.

MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts

cs.CL · 2024-11-22 · unverdicted · novelty 6.0

MolReFlect introduces a teacher-student framework that automatically creates fine-grained molecule-text alignments to achieve SOTA results on molecule-caption translation.

LLM-EDT: Large Language Model Enhanced Cross-domain Sequential Recommendation with Dual-phase Training

cs.IR · 2025-11-25 · unverdicted · novelty 5.0

LLM-EDT improves cross-domain sequential recommendation by using LLMs for transferable item augmentation, dual-phase training to handle domain transitions, and domain-aware profiling to build user profiles.

REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

cs.CL · 2026-05-12

Like a Hammer, It Can Build, It Can Break: Large Language Model Uses, Perceptions, and Adoption in Cybersecurity Operations on Reddit

cs.CR · 2026-04-11

SelfGrader: LLM Jailbreak Detection via Anchored Token-Level Logits

cs.CR · 2026-04-01