pith:7YXM6JRY
A Survey on Efficient Inference for Large Language Models
A survey organizes methods for efficient large language model inference into data-level, model-level, and system-level categories and benchmarks representative techniques.
arxiv:2404.14294 v3 · 2024-04-22 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{7YXM6JRYOH4O5NBMCHKEPDPSW2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
This paper presents a comprehensive survey of the existing literature on efficient LLM inference... organized into data-level, model-level, and system-level optimization... with comparative experiments on representative methods.
Underlying assumption: the chosen representative methods and experimental comparisons fairly represent the broader literature and yield generalizable quantitative insights without significant selection bias.
The paper surveys techniques to speed up and reduce the resource needs of LLM inference, organized by data-level, model-level, and system-level changes, with comparative experiments on representative methods.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:53.798407Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
fe2ecf263871f8eeb42c11d4478df2b69b77748d33f8d92acab2b44d81666059
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/7YXM6JRYOH4O5NBMCHKEPDPSW2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: fe2ecf263871f8eeb42c11d4478df2b69b77748d33f8d92acab2b44d81666059
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "7e45755716429abd0dc0e09cd3eff786a25f8857d55ccb1a7e23f2fa7d08b786",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2024-04-22T15:53:08Z",
    "title_canon_sha256": "0158e010d7858a65e7781dd03ec62b813bbae982fd020a8150281fd273403c03"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2404.14294",
    "kind": "arxiv",
    "version": 3
  }
}
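The verification one-liner above can also be run offline against the canonical record shown here. The following is a minimal sketch, not the registry's own tooling: it reuses the same serialization as the one-liner (sorted keys, compact separators, `ensure_ascii=False`) and the helper names `canonical_hash` and `base32_prefix` are hypothetical. The second check rests on an observation from this record only, namely that the Pith Number appears to match the first 26 base32 characters of the canonical hash; treat that as an assumption rather than a documented rule.

```python
import base64
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """SHA-256 of the record serialized as in the curl/jq/python3 one-liner."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """First `length` base32 characters of a hex digest (hypothetical helper)."""
    return base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")[:length]


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "7e45755716429abd0dc0e09cd3eff786a25f8857d55ccb1a7e23f2fa7d08b786",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2024-04-22T15:53:08Z",
        "title_canon_sha256": "0158e010d7858a65e7781dd03ec62b813bbae982fd020a8150281fd273403c03",
    },
    "schema_version": "1.0",
    "source": {"id": "2404.14294", "kind": "arxiv", "version": 3},
}

published_hash = "fe2ecf263871f8eeb42c11d4478df2b69b77748d33f8d92acab2b44d81666059"
pith_number = "7YXM6JRYOH4O5NBMCHKEPDPSW2"

# Check 1: does the locally computed hash match the published canonical hash?
print("hash matches:", canonical_hash(record) == published_hash)

# Check 2 (assumption): is the Pith Number a base32 prefix of the hash?
print("number matches:", base32_prefix(published_hash) == pith_number)
```

Because keys are sorted before hashing, the order in which the fields are typed into `record` does not affect the result; only the values and the serialization settings do.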