Pith Number

pith:PFZHD3TF

pith:2022:PFZHD3TFIWOEKN4ZU3CDZP3APP

not attested · not anchored · not stored · refs resolved

What learning algorithm is in-context learning? Investigations with linear models

Dale Schuurmans, Denny Zhou, Ekin Akyürek, Jacob Andreas, Tengyu Ma

Transformers implement gradient descent and ridge regression implicitly when doing in-context learning on linear tasks.

arxiv:2211.15661 v3 · 2022-11-28 · cs.LG · cs.CL


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{PFZHD3TFIWOEKN4ZU3CDZP3APP}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Trained in-context learners closely match the predictors computed by gradient descent, ridge regression, and exact least-squares regression, transitioning between different predictors as transformer depth and dataset noise vary, and converging to Bayesian estimators for large widths and depths.
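
For reference, the three predictors named in this claim have standard textbook forms for a linear model with input matrix X and target vector y. The notation below (X, y, w, \lambda, \alpha) is assumed for illustration and is not quoted from the paper.

```latex
% Ridge regression with regularization strength \lambda > 0
\hat{w}_{\text{ridge}}
  = \arg\min_{w} \; \|Xw - y\|_2^2 + \lambda \|w\|_2^2
  = (X^\top X + \lambda I)^{-1} X^\top y

% Exact least squares (\lambda = 0, assuming X^\top X is invertible)
\hat{w}_{\text{OLS}} = (X^\top X)^{-1} X^\top y

% One gradient descent step on the ridge objective, with step size \alpha
w_{t+1} = w_t - 2\alpha \left( X^\top (X w_t - y) + \lambda w_t \right)
```

Read this way, the claim says the trained transformer's predictions track one of these estimators, with the closest match depending on depth and noise level.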

C2 · weakest assumption

That results on linear regression as a prototypical problem will extend to the more complex, non-linear tasks typical of real in-context learning in language models.

C3 · one-line summary

Transformers performing in-context learning implicitly implement gradient descent, ridge regression, and least-squares predictors for linear models, with behavior shifting based on model depth, width, and data noise.

References

31 extracted · 31 resolved · 6 Pith anchors

[1] Understanding intermediate layers using linear classifier probes 2016 · arXiv:1610.01644

[2] Hoffman, David Pfau, Tom Schaul, and Nando de Freitas 2016

[3] Layer Normalization 2016 · arXiv:1607.06450

[4] Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretc 2020

[5] Thread: circuits 2020

Formal links

1 machine-checked theorem link

Cited by

17 papers in Pith

Online In-Context Distillation for Low-Resource Vision Language Models

Otter: A Multi-Modal Model with In-Context Instruction Tuning

Meta-Harness: End-to-End Optimization of Model Harnesses

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space

Receipt and verification

First computed	2026-05-17T23:38:13.930507Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

797271ee65459c453799a6c43cbf607be37dc8edbd11f2091c68b86538df2cc0

Aliases

arxiv: 2211.15661 · arxiv_version: 2211.15661v3 · doi: 10.48550/arxiv.2211.15661 · pith_short_12: PFZHD3TFIWOE · pith_short_16: PFZHD3TFIWOEKN4Z · pith_short_8: PFZHD3TF
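
The short aliases above are prefixes of the full 26-character Pith Number. For this record, the full identifier also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The sketch below checks both observations. The helper name `pith_from_hash` is hypothetical, and the base32 relationship is inferred from the values on this page, not taken from a published specification.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "797271ee65459c453799a6c43cbf607be37dc8edbd11f2091c68b86538df2cc0"
PITH_NUMBER = "PFZHD3TFIWOEKN4ZU3CDZP3APP"
SHORT_ALIASES = {8: "PFZHD3TF", 12: "PFZHD3TFIWOE", 16: "PFZHD3TFIWOEKN4Z"}


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: base32-encode a hex digest and keep a prefix.

    Assumption: the identifier is a truncated RFC 4648 base32 string.
    This matches the values on this page but is not a documented rule.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_from_hash(CANONICAL_HASH)

# Each short alias should equal the first n characters of the full number.
prefixes_ok = all(PITH_NUMBER[:n] == alias for n, alias in SHORT_ALIASES.items())

print(derived == PITH_NUMBER, prefixes_ok)  # prints: True True
```

If the relationship holds in general, a short alias can be recomputed from the canonical hash alone by passing `length=8`, `12`, or `16`.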


Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/PFZHD3TFIWOEKN4ZU3CDZP3APP \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 797271ee65459c453799a6c43cbf607be37dc8edbd11f2091c68b86538df2cc0

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "e8bab37bc0e56fc58a49bf89d5cf2bbb25839cce6a80f49b692916efa402136b",
    "cross_cats_sorted": [
      "cs.CL"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2022-11-28T18:59:51Z",
    "title_canon_sha256": "6caaadf80372d564f29b370e6c37cced5e9e052c272d7b54f59c819c6937e6d2"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2211.15661",
    "kind": "arxiv",
    "version": 3
  }
}
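
The shell pipeline under "Verify this Pith Number yourself" can also be written as a small Python function. The sketch below reuses the serialization settings from that one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) and applies them to the record printed above. The function name `canonical_sha256` is an assumption for illustration. Whether the digest equals the published canonical hash depends on the served record being byte-for-byte what is shown here, so no expected value is asserted.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The canonical record as printed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "e8bab37bc0e56fc58a49bf89d5cf2bbb25839cce6a80f49b692916efa402136b",
        "cross_cats_sorted": ["cs.CL"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2022-11-28T18:59:51Z",
        "title_canon_sha256": "6caaadf80372d564f29b370e6c37cced5e9e052c272d7b54f59c819c6937e6d2",
    },
    "schema_version": "1.0",
    "source": {"id": "2211.15661", "kind": "arxiv", "version": 3},
}

digest = canonical_sha256(record)
print(digest)  # compare against the "Canonical hash" value on this page
```

Because keys are sorted before hashing, the digest does not depend on the key order of the input dictionary, which is what lets independent mirrors arrive at the same value.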