What can transformers learn in-context? a case study of simple function classes.ArXiv, abs/2208.01066

Garg, Shivam, Tsipras, Dimitris, Liang, Percy, Valiant, Gregory , year = · 2022 · arXiv 2208.01066

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

background 2 method 1

citation-polarity summary

background 2 use method 1

representative citing papers

What learning algorithm is in-context learning? Investigations with linear models

cs.LG · 2022-11-28 · accept · novelty 8.0

Transformers performing in-context learning implicitly implement gradient descent, ridge regression, and least-squares predictors for linear models, with behavior shifting based on model depth, width, and data noise.

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

cs.CL · 2022-02-25 · accept · novelty 8.0

Randomly replacing labels in in-context demonstrations barely hurts performance, showing that label space, input distribution, and sequence format drive in-context learning more than ground-truth labels.

Spectral Transformer Neural Processes

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

STNPs extend TNPs with a spectral aggregator that estimates context spectra, forms spectral mixtures, and injects task-adaptive frequency features to better handle periodicity.

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

cs.LG · 2025-02-08 · unverdicted · novelty 6.0

TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datasets over 10K samples.

One for All: A Non-Linear Transformer can Enable Cross-Domain Generalization for In-Context Reinforcement Learning

cs.LG · 2026-05-10 · unverdicted · novelty 5.0

Non-linear transformers enable cross-domain generalization in in-context RL by representing value functions from different domains with shared weights inside a shared RKHS.

When Context Sticks: Studying Interference in In-Context Learning

cs.LG · 2026-04-25 · unverdicted · novelty 5.0

In-context learning shows persistent interference from prior examples, with more misleading linear examples degrading quadratic predictions and training curricula modulating recovery speed.

citing papers explorer

Showing 6 of 6 citing papers.

What learning algorithm is in-context learning? Investigations with linear models cs.LG · 2022-11-28 · accept · none · ref 11
Transformers performing in-context learning implicitly implement gradient descent, ridge regression, and least-squares predictors for linear models, with behavior shifting based on model depth, width, and data noise.
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? cs.CL · 2022-02-25 · accept · none · ref 188
Randomly replacing labels in in-context demonstrations barely hurts performance, showing that label space, input distribution, and sequence format drive in-context learning more than ground-truth labels.
Spectral Transformer Neural Processes cs.LG · 2026-05-10 · unverdicted · none · ref 16
STNPs extend TNPs with a spectral aggregator that estimates context spectra, forms spectral mixtures, and injects task-adaptive frequency features to better handle periodicity.
TabICL: A Tabular Foundation Model for In-Context Learning on Large Data cs.LG · 2025-02-08 · unverdicted · none · ref 171
TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datasets over 10K samples.
One for All: A Non-Linear Transformer can Enable Cross-Domain Generalization for In-Context Reinforcement Learning cs.LG · 2026-05-10 · unverdicted · none · ref 7
Non-linear transformers enable cross-domain generalization in in-context RL by representing value functions from different domains with shared weights inside a shared RKHS.
When Context Sticks: Studying Interference in In-Context Learning cs.LG · 2026-04-25 · unverdicted · none · ref 11
In-context learning shows persistent interference from prior examples, with more misleading linear examples degrading quadratic predictions and training curricula modulating recovery speed.

What can transformers learn in-context? a case study of simple function classes.ArXiv, abs/2208.01066

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer