Pith Number

pith:G5AADYN4

pith:2026:G5AADYN46YJRDMDAOMT4DUHU4S

not attested · not anchored · not stored · refs resolved

MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs

Ahmed Salem, Andrew Paverd, Jun Sakuma, Mark Russinovich, Rui Wen

LLM backdoors can activate on input length alone by exploiting positional encodings without any text changes.

arxiv:2605.15172 v1 · 2026-05-14 · cs.CR · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{G5AADYN46YJRDMDAOMT4DUHU4S}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

even a simple length-based positional trigger is sufficient to activate stealthy backdoors... a backdoored LLM can be induced to disclose sensitive internal information, including proprietary system prompts, once a length condition is satisfied.

C2 weakest assumption

That the model's internal representations of positional structure can be reliably shaped during training to create a stable, stealthy trigger without affecting normal behavior on non-trigger lengths.

C3 one-line summary

MetaBackdoor shows that LLMs can be backdoored using positional triggers like sequence length, enabling stealthy activation on clean inputs to leak system prompts or trigger malicious behavior.

References

49 extracted · 49 resolved · 8 Pith anchors

[1] BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain, 2017 · arXiv:1708.06733

[2] Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning, 2017 · arXiv:1712.05526

[3] PPT: Backdoor Attacks on Pre-trained Models via Poisoned Prompt Tuning, 2022

[4] NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models, 2023

[5] Training-free Lexical Backdoor Attacks on Language Models, 2023

Receipt and verification

First computed	2026-05-17T21:40:25.278762Z
Last reissued	2026-05-17T21:57:18.616145Z
Builder	pith-number-builder-2026-05-17-v1
Signature	unsigned_v0
Schema	pith-number/v1.0

Canonical hash

374001e1bcf61311b0607327c1d0f4e4895093a5923e52cd9c352fcfdf77f19b

Aliases

arxiv: 2605.15172 · arxiv_version: 2605.15172v1 · pith_short_12: G5AADYN46YJR · pith_short_16: G5AADYN46YJRDMDA · pith_short_8: G5AADYN4
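
In this record, the Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and each pith_short_N alias is an N-character prefix of that string. The Python sketch below checks this relationship. The function names are hypothetical, and treating this as the builder's actual derivation rule is an assumption: the page shows the values but does not state how they are produced.

```python
import base64

# Canonical hash as listed in this record.
CANONICAL_HASH = "374001e1bcf61311b0607327c1d0f4e4895093a5923e52cd9c352fcfdf77f19b"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: the 26-character length is inferred from this record only.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


def short_alias(pith_number: str, n: int) -> str:
    """Return the n-character prefix used by the pith_short_N aliases."""
    return pith_number[:n]


pith = pith_from_hash(CANONICAL_HASH)
aliases = {f"pith_short_{n}": short_alias(pith, n) for n in (8, 12, 16)}
print(pith)
print(aliases)
```

If the relationship holds in general, a short alias can be checked against a canonical hash without contacting the resolver; collisions among short prefixes would still need to be resolved by the service.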

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/G5AADYN46YJRDMDAOMT4DUHU4S \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 374001e1bcf61311b0607327c1d0f4e4895093a5923e52cd9c352fcfdf77f19b
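
The same check can be written as a small Python function, which may be easier to reuse than the one-liner. This is a minimal sketch that mirrors the canonicalization shown in the command above (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 encoding). The function name canonical_hash is hypothetical, and fetching the record from the resolver is left out.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """SHA-256 over the canonical JSON serialization of a record.

    Canonical form, as in the shell one-liner: keys sorted, no whitespace
    between tokens, non-ASCII characters kept as-is, encoded as UTF-8.
    """
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Key order in the input does not affect the result, because keys are
# sorted before hashing.
a = canonical_hash({"schema_version": "1.0", "source": {"id": "x", "version": 1}})
b = canonical_hash({"source": {"version": 1, "id": "x"}, "schema_version": "1.0"})
print(a == b)
```

To verify this record, pass the parsed canonical record JSON to canonical_hash and compare the result with the canonical hash listed above.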

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "7c5ca57f2172a267780a357cf476fa7071ea12d4024ca4f17c95a8152dd2cca4",
    "cross_cats_sorted": [
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CR",
    "submitted_at": "2026-05-14T17:56:22Z",
    "title_canon_sha256": "d7641841524c5cc050e515bb9a715569da76192a4f4e4a57b6cfb2740d976ee4"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15172",
    "kind": "arxiv",
    "version": 1
  }
}