pith:E25CM7BS
Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective
Fine-tuning achieves greater language proficiency than in-context learning on in-distribution generalization in formal languages, with equal out-of-distribution performance and diverging inductive biases at high proficiency.
arxiv:2604.23267 v2 · 2026-04-25 · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{E25CM7BSXEBRLIQ6GCAY5C35OP}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
FT has greater language proficiency than ICL on in-distribution generalization, but both perform equally well on out-of-distribution generalization. Their inductive biases, measured by the correlation in string generation probabilities, are similar when both modes partially learn the language but diverge at higher proficiency levels.
This rests on the assumption that success on the discriminative test in formal languages (assigning higher probability to in-language strings) accurately measures differences in language proficiency between FT and ICL in a manner relevant to natural language, and that the formal task provides sufficient control and no contamination.
Fine-tuning shows higher proficiency than in-context learning on in-distribution generalization in formal languages, with equal out-of-distribution performance and diverging inductive biases at high proficiency.
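The two measurements named in the claims can be illustrated with a minimal sketch. It assumes each model assigns a log-probability to every test string, that the discriminative test is scored as the fraction of pairs in which the in-language string receives the higher log-probability, and that inductive-bias similarity is a Pearson correlation over per-string log-probabilities. The function names, the pairing scheme, and the choice of Pearson correlation are illustrative assumptions, not details taken from the paper.

```python
import math


def discriminative_accuracy(pairs):
    """Fraction of (in_language, out_of_language) log-probability pairs
    in which the in-language string scores strictly higher.

    `pairs` is a hypothetical input format: one tuple per test item.
    """
    wins = sum(1 for in_lp, out_lp in pairs if in_lp > out_lp)
    return wins / len(pairs)


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists.

    Used here as an assumed stand-in for "correlation in string
    generation probabilities" between FT and ICL.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up log-probabilities, for illustration only.
ft_pairs = [(-1.0, -2.0), (-3.0, -2.5)]
ft_scores = [1.0, 2.0, 3.0]
icl_scores = [2.0, 4.0, 6.0]

ft_accuracy = discriminative_accuracy(ft_pairs)
bias_similarity = pearson(ft_scores, icl_scores)
```

Under these assumptions, a higher `ft_accuracy` than the ICL equivalent would correspond to greater proficiency, and a falling `bias_similarity` at high accuracy would correspond to diverging inductive biases.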
Receipt and verification
| First computed | 2026-05-20T00:05:45.118460Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
26ba267c32b90315a21e30818e8b7d73eae138f2adfbfae1f06132434fded2c0
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/E25CM7BSXEBRLIQ6GCAY5C35OP \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 26ba267c32b90315a21e30818e8b7d73eae138f2adfbfae1f06132434fded2c0
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "09f2bbf13aadb8100f5406375201f4bd022551e18008b36ef94f95cdee684bca",
"cross_cats_sorted": [
"cs.LG"
],
"license": "http://creativecommons.org/licenses/by/4.0/",
"primary_cat": "cs.CL",
"submitted_at": "2026-04-25T12:19:25Z",
"title_canon_sha256": "80c87c7ca39a40c603e0714f178c0aabf4cf8c74e21ae3ce61e822d810081ee4"
},
"schema_version": "1.0",
"source": {
"id": "2604.23267",
"kind": "arxiv",
"version": 2
}
}
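The same check can be run offline against the record shown above. This is a minimal sketch that mirrors the serialization in the shell one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding). The helper name `canonical_sha256` is hypothetical, and whether the digest of this pasted copy matches the published hash depends on the record being byte-for-byte what the service returns.

```python
import hashlib
import json


def canonical_sha256(record):
    """SHA-256 hex digest of a record serialized as compact, key-sorted JSON."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "09f2bbf13aadb8100f5406375201f4bd022551e18008b36ef94f95cdee684bca",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-04-25T12:19:25Z",
        "title_canon_sha256": "80c87c7ca39a40c603e0714f178c0aabf4cf8c74e21ae3ce61e822d810081ee4",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.23267", "kind": "arxiv", "version": 2},
}

# Hash published on this page under "Canonical hash".
expected = "26ba267c32b90315a21e30818e8b7d73eae138f2adfbfae1f06132434fded2c0"

digest = canonical_sha256(record)
matches = digest == expected  # inspect rather than assume
```

Sorting keys and fixing the separators makes the digest independent of key order and whitespace in the source JSON, which is why the record can be pasted in any layout.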