pith:PFQ4IG73
PreFT: Prefill-only finetuning for efficient inference
Applying adapters only during prefill and discarding them afterward raises serving throughput nearly twofold while keeping performance near standard PEFT levels.
arxiv:2605.14217 v1 · 2026-05-14 · cs.LG · cs.AI · cs.CL · cs.SY · eess.SY
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{PFQ4IG73A4U5ZLJPCG76ZK7MIQ}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Serving multi-user PreFTs is more efficient than serving traditional PEFTs (1.9× the throughput when serving 512 adapters on Llama 3.1 70B). On RL tasks, PreFTs approach parity with standard PEFTs.
Discarding the adapter after prefill does not materially degrade the quality of the generated tokens on downstream tasks, and any loss can be offset by increasing adapter rank without throughput cost.
Prefill-only adaptation of LLMs yields 1.9× higher throughput when serving 512 adapters on Llama 3.1 70B, with near-parity performance on RL tasks and recoverable loss on SFT.
Receipt and verification
| First computed | 2026-05-17T23:39:10.867464Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
7961c41bfb0729dcad2f11bfecabec44385f09e18554f02d884e88f96772842a
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/PFQ4IG73A4U5ZLJPCG76ZK7MIQ \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7961c41bfb0729dcad2f11bfecabec44385f09e18554f02d884e88f96772842a
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ae94a8a8261dba87cb2891b1f9d169d168d2be8c1ae311ed59ec1cceb1113bef",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.SY",
      "eess.SY"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T00:19:41Z",
    "title_canon_sha256": "f6fbe28f555b9ae084ed42d96daebe9c88f492ec8dabe5a81d3ffc062b157bb5"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14217",
    "kind": "arxiv",
    "version": 1
  }
}
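The verification one-liner can also be unpacked into a short offline script, which makes the canonicalization step easier to inspect. The sketch below copies the record and the published hash from this page and applies the same serialization as the one-liner (sorted keys, no whitespace, UTF-8). The names `RECORD`, `PUBLISHED_HASH`, `canonical_bytes`, and `record_hash` are illustrative and not part of any Pith API. Whether the digest matches depends on the record shown here being exactly what the service hashes, which is an assumption.

```python
import hashlib
import json

# Canonical record as displayed on this page (copied by hand; assumed complete).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "ae94a8a8261dba87cb2891b1f9d169d168d2be8c1ae311ed59ec1cceb1113bef",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.SY", "eess.SY"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-14T00:19:41Z",
        "title_canon_sha256": "f6fbe28f555b9ae084ed42d96daebe9c88f492ec8dabe5a81d3ffc062b157bb5",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14217", "kind": "arxiv", "version": 1},
}

# Canonical hash as listed on this page.
PUBLISHED_HASH = "7961c41bfb0729dcad2f11bfecabec44385f09e18554f02d884e88f96772842a"


def canonical_bytes(record):
    """Serialize with sorted keys and compact separators, then encode as UTF-8.

    This mirrors the json.dumps arguments used in the shell one-liner.
    """
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def record_hash(record):
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


digest = record_hash(RECORD)
print("computed :", digest)
print("published:", PUBLISHED_HASH)
print("match    :", digest == PUBLISHED_HASH)
```

Because keys are sorted before hashing, the order in which the fields are typed into `RECORD` does not affect the digest; only the keys and values themselves do.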