Pith Number

pith:F3FPAIIF

pith:2024:F3FPAIIFWWF3BMZ5TRBYHCPUNI

not attested not anchored not stored refs resolved

Defending Against Indirect Prompt Injection Attacks With Spotlighting

Emre Kiciman, Federico Zarfati, Gary Lopez, Keegan Hines, Matthew Hall, Yonatan Zunger

Spotlighting uses input transformations to mark data origins, letting LLMs ignore embedded adversarial instructions and cutting indirect prompt injection success from over 50% to under 2%.

arxiv:2403.14720 v1 · 2024-03-20 · cs.CR · cs.CL · cs.LG

Open paper page JSON Open Graph Bundle Merged state Verified badge What is a Pith Number?

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{F3FPAIIFWWF3BMZ5TRBYHCPUNI}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv. Learn more · Embed verified badge

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open · sign in to claim

4 Citations open

5 Replications open

✓ Portable graph bundle live · download bundle · merged state

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1strongest claim

spotlighting reduces the attack success rate from greater than 50% to below 2% in our experiments with minimal impact on task efficacy.

C2weakest assumption

That the chosen input transformations create a reliable, continuous provenance signal that LLMs will consistently interpret and follow without being bypassed by new attack variants.

C3one line summary

Spotlighting prompt transformations cut indirect prompt injection success rates from >50% to <2% on GPT models while preserving task performance.

References

22 extracted · 22 resolved · 12 Pith anchors

[1] Code Llama: Open Foundation Models for Code 2023 · arXiv:2308.12950

[2] Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models 2023

[3] SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems 1905 · arXiv:1905.00537

[4] SQuAD: 100,000+ Questions for Machine Comprehension of Text 2016 · arXiv:1606.05250

[5] Learning Word Vectors for Sentiment Analysis, 2011

Formal links

3 machine-checked theorem links

Cited by

34 papers in Pith

Progent: Securing AI Agents with Privilege Control

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction

Evaluating Prompt Injection Defenses for Educational LLM Tutors: Security-Usability-Latency Trade-offs

Beyond Pattern Matching: Seven Cross-Domain Techniques for Prompt Injection Detection

LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection

Receipt and verification

First computed	2026-05-17T23:39:21.453581Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

2ecaf02105b58bb0b33d9c438389f46a3d996bf390d91fe8aa9b2b65415e8c1f

Aliases

arxiv: 2403.14720 · arxiv_version: 2403.14720v1 · doi: 10.48550/arxiv.2403.14720 · pith_short_12: F3FPAIIFWWF3 · pith_short_16: F3FPAIIFWWF3BMZ5 · pith_short_8: F3FPAIIF

Agent API

Resolver JSON Graph JSON Events JSON Schema Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/F3FPAIIFWWF3BMZ5TRBYHCPUNI \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 2ecaf02105b58bb0b33d9c438389f46a3d996bf390d91fe8aa9b2b65415e8c1f

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "fea25809dd0b1a10541abd53367f0aa00aebbf135846fabc969247d4eaeb46eb",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CR",
    "submitted_at": "2024-03-20T15:26:23Z",
    "title_canon_sha256": "8301859de849661297e773cbeb44b0110dbd106073897754b6d4c5439be2684a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2403.14720",
    "kind": "arxiv",
    "version": 1
  }
}