Pith Number

pith:3EUUJX67

pith:2026:3EUUJX67Z5S2ATHNLCN7UPXCJA
not attested · not anchored · not stored · refs pending

Dynamic Adversarial Fine-Tuning Reorganizes Refusal Geometry

Haihua Shen, Junbin Yang, Shan Li, Wenhao Lan, Yijun Yang

Dynamic adversarial fine-tuning reorganizes refusal geometry by relocating the primary carrier from late layers to early layers.

arxiv:2604.27019 v2 · 2026-04-29 · cs.LG · cs.CL · cs.CR

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{3EUUJX67Z5S2ATHNLCN7UPXCJA}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

These results support a reorganization account rather than a drift-only account, with evidence limited to one backbone and fixed-source attacks. R2D2 preserves a late-layer admissible carrier through step 100 before relocating to an early-layer carrier, while effective rank remains near 1.23–1.27.
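
For readers unfamiliar with the term, the sketch below illustrates what an effective rank close to 1 means. This record does not say which definition the paper uses, so the code shows two common ones as an assumption: the exponential of the Shannon entropy of the normalized singular values, and the participation ratio of the squared singular values. The example spectrum is hypothetical and is not data from the paper.

```python
import math


def entropy_effective_rank(singular_values):
    """Exponential of the Shannon entropy of the normalized singular values."""
    total = sum(singular_values)
    # Zero singular values contribute nothing to the entropy, so skip them.
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


def participation_ratio(singular_values):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues), eigenvalue = s^2."""
    eigenvalues = [s * s for s in singular_values]
    return sum(eigenvalues) ** 2 / sum(e * e for e in eigenvalues)


# Hypothetical spectrum: one dominant direction plus two small ones.
spectrum = [1.0, 0.05, 0.01]
print(entropy_effective_rank(spectrum))
print(participation_ratio(spectrum))
```

Both measures equal 1 when a single direction carries all the variance and equal n when n directions contribute equally, so a value between 1 and 2 indicates a structure dominated by one direction.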

C2 weakest assumption

That the five-anchor refusal-geometry suite and causal interventions accurately isolate reorganization effects without being confounded by the specific 7B backbone, fixed attack sources, or unmeasured factors in the training dynamics.

C3 one line summary

R2D2-style dynamic adversarial fine-tuning reorganizes refusal geometry from late-layer to early-layer carriers in LLMs, achieving lower attack success rates than SFT while maintaining low-dimensional structure.

Receipt and verification
First computed 2026-05-20T00:03:13.064371Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

d92944dfdfcf65a04ced589bfa3ee2481bd135a94f0d604ef1c05e5bafefb3ca

Aliases

arxiv: 2604.27019 · arxiv_version: 2604.27019v2 · doi: 10.48550/arxiv.2604.27019 · pith_short_12: 3EUUJX67Z5S2 · pith_short_16: 3EUUJX67Z5S2ATHN · pith_short_8: 3EUUJX67
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/3EUUJX67Z5S2ATHNLCN7UPXCJA \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d92944dfdfcf65a04ced589bfa3ee2481bd135a94f0d604ef1c05e5bafefb3ca
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a33e5c77f8972c6899dffa859d630a2f0e07185da66e708ef465b7cc16a55450",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CR"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-04-29T12:44:05Z",
    "title_canon_sha256": "927b03b1fcd80ef19340f467342c19c575ccadfebac359d713a90831292e5fed"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.27019",
    "kind": "arxiv",
    "version": 2
  }
}
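
The shell one-liner under "Verify this Pith Number yourself" can also be written as a small offline Python script. This is a minimal sketch that assumes the canonicalization is exactly what the one-liner shows (sorted keys, compact separators, non-ASCII preserved, UTF-8 bytes, SHA-256). The function name `canonical_sha256` is introduced here for illustration and is not part of any Pith tooling.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record using the same serialization as the shell one-liner."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a33e5c77f8972c6899dffa859d630a2f0e07185da66e708ef465b7cc16a55450",
        "cross_cats_sorted": ["cs.CL", "cs.CR"],
        "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-04-29T12:44:05Z",
        "title_canon_sha256": "927b03b1fcd80ef19340f467342c19c575ccadfebac359d713a90831292e5fed",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.27019", "kind": "arxiv", "version": 2},
}

# Hash stated on this page under "Canonical hash".
EXPECTED = "d92944dfdfcf65a04ced589bfa3ee2481bd135a94f0d604ef1c05e5bafefb3ca"

digest = canonical_sha256(record)
print(digest)
print("matches page hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the result does not depend on the order in which a mirror stores or transmits the record's fields.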