
In-context Learning and Induction Heads

Catherine Olsson, Neel Nanda, Nelson Elhage, Nicholas Joseph, Nova DasSarma, Tom Henighan · 2022 · cs.LG · arXiv 2209.11895

Canonical reference. 80% of citing Pith papers cite this work as background.

143 Pith papers citing it

Background 80% of classified citations


abstract

"Induction heads" are attention heads that implement a simple algorithm to complete token sequences like [A][B] ... [A] -> [B]. In this work, we present preliminary and indirect evidence for a hypothesis that induction heads might constitute the mechanism for the majority of all "in-context learning" in large transformer models (i.e. decreasing loss at increasing token indices). We find that induction heads develop at precisely the same point as a sudden sharp increase in in-context learning ability, visible as a bump in the training loss. We present six complementary lines of evidence, arguing that induction heads may be the mechanistic source of general in-context learning in transformer models of any size. For small attention-only models, we present strong, causal evidence; for larger models with MLPs, we present correlational evidence.
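The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code, and real induction heads operate on attention patterns and learned embeddings rather than exact token matches. The names `induction_predict` and `in_context_learning_score`, and the choice of which early and late token indices to compare, are assumptions made for illustration only.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def induction_predict(tokens: Sequence[T]) -> Optional[T]:
    """Toy version of the [A][B] ... [A] -> [B] completion rule.

    Finds the most recent earlier occurrence of the current (last) token
    and returns the token that followed it, or None if there is no match.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan backwards over earlier positions, skipping the current one.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]  # copy the token that followed the match
    return None


def in_context_learning_score(
    per_token_loss: Sequence[float], early: int, late: int
) -> float:
    """Loss at a late token index minus loss at an early one.

    A negative value means loss decreases at later token indices, which is
    the sense of "in-context learning" used in the abstract. The indices
    are caller-chosen (hypothetical) parameters.
    """
    return per_token_loss[late] - per_token_loss[early]


if __name__ == "__main__":
    print(induction_predict(["A", "B", "C", "A"]))
    print(in_context_learning_score([4.0, 3.0, 2.5, 2.0], early=0, late=3))
```

The exact-match lookup stands in for the head's prefix matching and copying behaviour. The score is a plain difference so that its sign directly reflects whether later tokens are easier to predict.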


citation-role summary

background 18 · dataset 1 · other 1

citation-polarity summary

background 16 · unclear 2 · support 1 · use dataset 1




representative citing papers

Efficiently Representing Algorithms With Chain-of-Thought Transformers

cs.LG · 2026-06-18 · conditional · novelty 8.0

CoT transformers simulate any Word RAM algorithm with poly-logarithmic overhead in three architectures, improving on quadratic TM overhead.

Looped Transformers with Layer Normalization Provably Learn the Power Method

cs.LG · 2026-05-30 · unverdicted · novelty 8.0

Looped linear transformers with LN provably converge via GD to implement the power method on principal component prediction.

Towards Verifiable Transformers: Solver-Checkable Circuit Explanations

cs.LG · 2026-05-21 · unverdicted · novelty 8.0

Presents a solver-verifiable framework for Transformer circuits, with exhaustive checks on small symbolic tasks and surrogate methods for larger models.

WriteSAE: Sparse Autoencoders for Recurrent State

cs.LG · 2026-05-12 · unverdicted · novelty 8.0 · 3 refs

WriteSAE introduces sparse autoencoders with rank-1 matrix atoms for recurrent state updates, allowing replacement tests that outperform deletion on 92.4% of positions and a formula predicting logit changes with R²=0.98.

Slot Machines: How LLMs Keep Track of Multiple Entities

cs.CL · 2026-04-22 · unverdicted · novelty 8.0

LLM activations encode current and prior entities in orthogonal slots, but models only use the current slot for explicit factual retrieval despite prior-slot information being linearly decodable.

The Spectral Lifecycle of Transformer Training: Transient Compression Waves, Persistent Spectral Gradients, and the Q/K--V Asymmetry

cs.LG · 2026-04-03 · unverdicted · novelty 8.0

Transformer weight spectra exhibit transient compression waves that propagate layer-wise, persistent non-monotonic depth gradients in power-law exponents, and Q/K-V asymmetry, with the spectral exponent alpha predicting layer importance and enabling pruning gains of 1.1x-3.6x over Last-N baselines.

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

cs.AI · 2024-08-12 · unverdicted · novelty 8.0

The AI Scientist framework enables LLMs to independently conduct the full scientific process from idea generation to paper writing and review, demonstrated across three ML subfields with papers costing under $15 each.

KAN: Kolmogorov-Arnold Networks

cs.LG · 2024-04-30 · conditional · novelty 8.0

KANs with learnable univariate spline activations on edges achieve better accuracy than MLPs with fewer parameters, faster scaling, and direct visualization for scientific discovery.

Localizing Model Behavior with Path Patching

cs.LG · 2023-04-12 · unverdicted · novelty 8.0

Path patching provides a method to express and quantitatively test hypotheses that neural network behaviors are localized to sets of paths.

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

cs.LG · 2022-11-01 · conditional · novelty 8.0

GPT-2 small solves indirect object identification via a circuit of 26 attention heads organized into seven functional classes discovered through causal interventions.

ReContext: Recursive Evidence Replay as LLM Harness for Long-Context Reasoning

cs.AI · 2026-07-02 · unverdicted · novelty 7.0

RECONTEXT is a recursive evidence replay technique that improves long-context reasoning in LLMs by constructing and replaying a query-conditioned evidence pool before final generation.

Can Language Models Actually Retrieve In-Context? Drowning in Documents at Million Token Scale

cs.CL · 2026-07-01 · unverdicted · novelty 7.0

A 0.6B LM with length-aware attention adjustments performs competitive in-context retrieval at million-token scale on MS MARCO, NQ, and LIMIT benchmarks.

Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads

cs.CL · 2026-07-01 · unverdicted · novelty 7.0

LOCOS scores attention heads via OV-circuit output projection onto answer-token unembedding directions and identifies non-literal retrieval heads whose ablation collapses performance on non-literal benchmarks more than prior literal-copy detectors.

SemRF: A Semantic Reference Frame for Residual-Stream Dynamics in Language Models

cs.LG · 2026-06-30 · unverdicted · novelty 7.0

SemRF supplies fixed semantic anchors and pseudo-inverse tying to produce stable coordinates for residual dynamics, Voronoi traces, and minimum-action canonical paths that link to parameter efficiency under controlled interface error.

ECHO: Learning Epistemically Adaptive Language Agents with Turn-Level Credit

cs.MA · 2026-06-29 · unverdicted · novelty 7.0

ECHO is a clipped policy-gradient method that uses posterior-sensitive rewards to give turn-level epistemic credit in multi-turn information-seeking tasks, outperforming trajectory-level GRPO on a new Clue Selector Game benchmark.

Symbolic Mechanistic Data Attribution: Tracing Training Influence to Learned Behavioral Policies

cs.LG · 2026-06-28 · unverdicted · novelty 7.0

SMDA fits ridge regression on SAE features to distill symbolic policies then decomposes each SFT example's influence via feature-activation and output-probability deltas, demonstrated on refusal behavior in Llama-3.2-3B-Instruct.

DREAM: Dense Retrieval Embeddings via Autoregressive Modeling

cs.CL · 2026-06-23 · unverdicted · novelty 7.0

DREAM enables training of dense retrieval embeddings using autoregressive next-token prediction from LLMs by modulating attention with retriever scores.

Mind the Heads: Topological Representation Alignment for Multimodal LLMs

cs.CV · 2026-06-22 · conditional · novelty 7.0

HeRA aligns least-aligned attention heads in MLLMs using an MKNN-based contrastive objective to preserve cross-modal topological structure, yielding gains on vision-centric tasks and reduced hallucinations across 18 benchmarks.

Safe to Check, Unsafe to Use: Relinking at the Compression Boundary of LLM Agents

cs.CR · 2026-06-19 · unverdicted · novelty 7.0

Relinking is a new compression-boundary attack on LLM agents where summarization of split benign fragments produces malicious instructions, shown via Relink tool at 86.9% success rate and mitigated by KBRA defense to 0%.

Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models

cs.CV · 2026-06-10 · conditional · novelty 7.0

Reroute turns irreversible visual-token pruning into recoverable routing that reuses existing attention scores, improving grounding performance under aggressive reduction on LLaVA-1.5 and Qwen while preserving TFLOPs and KV-cache budgets.

Phase Transitions in Attention: A Bayesian Theory of Copy Head Emergence

stat.ML · 2026-06-10 · unverdicted · novelty 7.0

Bayesian reduction of attention posterior on copy task predicts first-order phase transition for softmax attention and second-order followed by crossover for linear attention.

STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations

cs.LG · 2026-06-03 · unverdicted · novelty 7.0

STRIDE formulates TDA as sparse recovery using steering operators that mimic subset training effects in activation space, claiming SOTA LLM pre-training attribution at 13x prior speed.

What Makes Chain-of-Thought Work at Probe Time? Local Co-occurrence Rather Than Global Derivation

cs.AI · 2026-05-26 · unverdicted · novelty 7.0

CoT probe-time gains arise primarily from lexical activation and short-range token co-occurrence rather than sentence-level logical derivation.

Geometry-Adaptive Explainer for Faithful Dictionary-Based Interpretability under Distribution Shift

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

GAE reduces the faithfulness gap in dictionary-based explainers under distribution shift by geometrically realigning the ID dictionary to the OOD-active subspace, with a quadratic excess-loss bound.

citing papers explorer

Showing 50 of 143 citing papers.

Efficiently Representing Algorithms With Chain-of-Thought Transformers cs.LG · 2026-06-18 · conditional · none · ref 10 · internal anchor
CoT transformers simulate any Word RAM algorithm with poly-logarithmic overhead in three architectures, improving on quadratic TM overhead.
Looped Transformers with Layer Normalization Provably Learn the Power Method cs.LG · 2026-05-30 · unverdicted · none · ref 74 · internal anchor
Looped linear transformers with LN provably converge via GD to implement the power method on principal component prediction.
Towards Verifiable Transformers: Solver-Checkable Circuit Explanations cs.LG · 2026-05-21 · unverdicted · none · ref 14 · internal anchor
Presents a solver-verifiable framework for Transformer circuits, with exhaustive checks on small symbolic tasks and surrogate methods for larger models.
WriteSAE: Sparse Autoencoders for Recurrent State cs.LG · 2026-05-12 · unverdicted · none · ref 41 · 3 links · internal anchor
WriteSAE introduces sparse autoencoders with rank-1 matrix atoms for recurrent state updates, allowing replacement tests that outperform deletion on 92.4% of positions and a formula predicting logit changes with R²=0.98.
Slot Machines: How LLMs Keep Track of Multiple Entities cs.CL · 2026-04-22 · unverdicted · none · ref 10 · internal anchor
LLM activations encode current and prior entities in orthogonal slots, but models only use the current slot for explicit factual retrieval despite prior-slot information being linearly decodable.
The Spectral Lifecycle of Transformer Training: Transient Compression Waves, Persistent Spectral Gradients, and the Q/K--V Asymmetry cs.LG · 2026-04-03 · unverdicted · none · ref 12 · internal anchor
Transformer weight spectra exhibit transient compression waves that propagate layer-wise, persistent non-monotonic depth gradients in power-law exponents, and Q/K-V asymmetry, with the spectral exponent alpha predicting layer importance and enabling pruning gains of 1.1x-3.6x over Last-N baselines.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery cs.AI · 2024-08-12 · unverdicted · none · ref 78 · internal anchor
The AI Scientist framework enables LLMs to independently conduct the full scientific process from idea generation to paper writing and review, demonstrated across three ML subfields with papers costing under $15 each.
KAN: Kolmogorov-Arnold Networks cs.LG · 2024-04-30 · conditional · none · ref 80 · internal anchor
KANs with learnable univariate spline activations on edges achieve better accuracy than MLPs with fewer parameters, faster scaling, and direct visualization for scientific discovery.
Localizing Model Behavior with Path Patching cs.LG · 2023-04-12 · unverdicted · none · ref 47 · internal anchor
Path patching provides a method to express and quantitatively test hypotheses that neural network behaviors are localized to sets of paths.
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small cs.LG · 2022-11-01 · conditional · none · ref 35 · internal anchor
GPT-2 small solves indirect object identification via a circuit of 26 attention heads organized into seven functional classes discovered through causal interventions.
ReContext: Recursive Evidence Replay as LLM Harness for Long-Context Reasoning cs.AI · 2026-07-02 · unverdicted · none · ref 53 · internal anchor
RECONTEXT is a recursive evidence replay technique that improves long-context reasoning in LLMs by constructing and replaying a query-conditioned evidence pool before final generation.
Can Language Models Actually Retrieve In-Context? Drowning in Documents at Million Token Scale cs.CL · 2026-07-01 · unverdicted · none · ref 39 · internal anchor
A 0.6B LM with length-aware attention adjustments performs competitive in-context retrieval at million-token scale on MS MARCO, NQ, and LIMIT benchmarks.
Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads cs.CL · 2026-07-01 · unverdicted · none · ref 25 · internal anchor
LOCOS scores attention heads via OV-circuit output projection onto answer-token unembedding directions and identifies non-literal retrieval heads whose ablation collapses performance on non-literal benchmarks more than prior literal-copy detectors.
SemRF: A Semantic Reference Frame for Residual-Stream Dynamics in Language Models cs.LG · 2026-06-30 · unverdicted · none · ref 13 · internal anchor
SemRF supplies fixed semantic anchors and pseudo-inverse tying to produce stable coordinates for residual dynamics, Voronoi traces, and minimum-action canonical paths that link to parameter efficiency under controlled interface error.
ECHO: Learning Epistemically Adaptive Language Agents with Turn-Level Credit cs.MA · 2026-06-29 · unverdicted · none · ref 47 · internal anchor
ECHO is a clipped policy-gradient method that uses posterior-sensitive rewards to give turn-level epistemic credit in multi-turn information-seeking tasks, outperforming trajectory-level GRPO on a new Clue Selector Game benchmark.
Symbolic Mechanistic Data Attribution: Tracing Training Influence to Learned Behavioral Policies cs.LG · 2026-06-28 · unverdicted · none · ref 59 · internal anchor
SMDA fits ridge regression on SAE features to distill symbolic policies then decomposes each SFT example's influence via feature-activation and output-probability deltas, demonstrated on refusal behavior in Llama-3.2-3B-Instruct.
DREAM: Dense Retrieval Embeddings via Autoregressive Modeling cs.CL · 2026-06-23 · unverdicted · none · ref 24 · internal anchor
DREAM enables training of dense retrieval embeddings using autoregressive next-token prediction from LLMs by modulating attention with retriever scores.
Mind the Heads: Topological Representation Alignment for Multimodal LLMs cs.CV · 2026-06-22 · conditional · none · ref 26 · internal anchor
HeRA aligns least-aligned attention heads in MLLMs using an MKNN-based contrastive objective to preserve cross-modal topological structure, yielding gains on vision-centric tasks and reduced hallucinations across 18 benchmarks.
Safe to Check, Unsafe to Use: Relinking at the Compression Boundary of LLM Agents cs.CR · 2026-06-19 · unverdicted · none · ref 38 · internal anchor
Relinking is a new compression-boundary attack on LLM agents where summarization of split benign fragments produces malicious instructions, shown via Relink tool at 86.9% success rate and mitigated by KBRA defense to 0%.
Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models cs.CV · 2026-06-10 · conditional · none · ref 54 · internal anchor
Reroute turns irreversible visual-token pruning into recoverable routing that reuses existing attention scores, improving grounding performance under aggressive reduction on LLaVA-1.5 and Qwen while preserving TFLOPs and KV-cache budgets.
Phase Transitions in Attention: A Bayesian Theory of Copy Head Emergence stat.ML · 2026-06-10 · unverdicted · none · ref 17 · internal anchor
Bayesian reduction of attention posterior on copy task predicts first-order phase transition for softmax attention and second-order followed by crossover for linear attention.
STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations cs.LG · 2026-06-03 · unverdicted · none · ref 90 · internal anchor
STRIDE formulates TDA as sparse recovery using steering operators that mimic subset training effects in activation space, claiming SOTA LLM pre-training attribution at 13x prior speed.
What Makes Chain-of-Thought Work at Probe Time? Local Co-occurrence Rather Than Global Derivation cs.AI · 2026-05-26 · unverdicted · none · ref 12 · internal anchor
CoT probe-time gains arise primarily from lexical activation and short-range token co-occurrence rather than sentence-level logical derivation.
Geometry-Adaptive Explainer for Faithful Dictionary-Based Interpretability under Distribution Shift cs.LG · 2026-05-21 · unverdicted · none · ref 2 · internal anchor
GAE reduces the faithfulness gap in dictionary-based explainers under distribution shift by geometrically realigning the ID dictionary to the OOD-active subspace, with a quadratic excess-loss bound.
Where Pretraining writes and Alignment reads: the asymmetry of Transformer weight space cs.LG · 2026-05-15 · unverdicted · none · ref 43 · internal anchor
Pretraining and alignment induce asymmetric geometric traces in transformer weights because alignment updates concentrate in read pathways due to activation covariance while write pathways inherit less structure from alignment losses.
Assessing the Creativity of Large Language Models: Testing, Limits, and New Frontiers cs.AI · 2026-05-13 · conditional · none · ref 13 · internal anchor
The Divergent Remote Association Test (DRAT) is the first creativity test that significantly predicts LLMs' scientific ideation ability, unlike prior tests such as DAT or RAT.
Learning Less Is More: Premature Upper-Layer Attention Specialization Hurts Language Model Pretraining cs.CL · 2026-05-11 · unverdicted · none · ref 11 · internal anchor
Temporarily reducing the learning rate on upper-layer query and key projections during early GPT pretraining prevents premature attention specialization and improves model performance.
Self-Attention as a Covariance Readout: A Unified View of In-Context Learning and Repetition cs.LG · 2026-05-11 · unverdicted · none · ref 24 · internal anchor
Self-attention acts as a covariance readout that unifies in-context learning via population gradient descent and repetitive generation via asymptotic Markov behavior.
From Mechanistic to Compositional Interpretability cs.LG · 2026-05-09 · unverdicted · none · ref 52 · 2 links · internal anchor
The paper introduces compositional interpretability as a category-theoretic framework that casts mechanistic explanations as commuting syntactic-semantic mappings optimized under faithfulness and complexity constraints derived from minimum description length.
Understanding Performance Collapse in Layer-Pruned Large Language Models via Decision Representation Transitions cs.CL · 2026-05-08 · unverdicted · none · ref 61 · internal anchor
Performance collapse in layer-pruned LLMs stems from disrupting the Silent Phase of decision-making, which blocks the transition to correct predictions, while the later Decisive Phase is robust to pruning.
Elicitation Matters: How Prompts and Query Protocols Shape LLM Surrogates under Sparse Observations cs.CL · 2026-05-06 · unverdicted · none · ref 15 · internal anchor
LLM surrogate beliefs under sparse observations depend on prompts and query protocols, with structural prompts as priors, pointwise vs joint querying producing different beliefs, and sequential evidence causing non-monotonic updates that affect acquisition and regret.
Task Vector Geometry Underlies Dual Modes of Task Inference in Transformers cs.LG · 2026-05-05 · unverdicted · none · ref 26 · internal anchor
In a controlled synthetic setting, transformers implement in-distribution task inference via convex combinations of task vectors and out-of-distribution inference via nearly orthogonal extrapolative representations.
Tracing the Dynamics of Refusal: Exploiting Latent Refusal Trajectories for Robust Jailbreak Detection cs.CR · 2026-05-02 · unverdicted · none · ref 4 · 2 links · internal anchor
Causal tracing reveals a persistent Refusal Trajectory in LLM hidden states; SALO detector using sparse activations from a layer window improves jailbreak detection across Qwen, Llama, and Mistral models.
How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models cs.LG · 2026-04-22 · unverdicted · none · ref 52 · internal anchor
A fitted iso-depth scaling law measures that one recurrence in looped transformers is worth r^0.46 unique blocks in validation loss.
Cell-Based Representation of Relational Binding in Language Models cs.CL · 2026-04-21 · unverdicted · none · ref 52 · internal anchor
Large language models encode relational bindings via a cell-based representation: a low-dimensional linear subspace in which each cell corresponds to an entity-relation index pair and attributes are retrieved from the matching cell.
HeadRank: Decoding-Free Passage Reranking via Preference-Aligned Attention Heads cs.IR · 2026-04-19 · unverdicted · none · ref 20 · internal anchor
HeadRank lifts preference optimization into attention space via entropy-regularized head selection and distribution regularizers to sharpen discriminability for efficient listwise reranking.
Wiring the 'Why': A Unified Taxonomy and Survey of Abductive Reasoning in LLMs cs.AI · 2026-04-09 · accept · none · ref 74 · internal anchor
The paper delivers the first survey of abductive reasoning in LLMs, a unified two-stage taxonomy, a compact benchmark, and an analysis of gaps relative to deductive and inductive reasoning.
Screening Is Enough cs.LG · 2026-04-01 · unverdicted · none · ref 26 · internal anchor
Multiscreen replaces softmax attention with screening to provide absolute query-key relevance, resulting in models with 30% fewer parameters that maintain stable performance at long contexts.
Sharp Capacity Scaling of Spectral Optimizers in Learning Associative Memory cs.LG · 2026-03-27 · unverdicted · none · ref 40 · internal anchor
Muon achieves higher storage capacity than SGD and matches Newton's method in one-step recovery rates for associative memory under power-law distributions, while saturating at larger critical batch sizes and showing faster initial multi-step dynamics.
How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability cs.CL · 2026-01-27 · unverdicted · none · ref 15 · internal anchor
Transformer weights at early training stages are closed-form compositions of bigram, token-interchangeability, and context mappings that directly reflect text-corpus statistics and explain the emergence of semantic associations.
The Bayesian Geometry of Transformer Attention cs.LG · 2025-12-27 · unverdicted · none · ref 2 · internal anchor
Small transformers reproduce known Bayesian posteriors with 10^{-3} to 10^{-4} bit accuracy in verifiable wind-tunnel tasks via residual belief states, FFN updates, and attention routing, while MLPs do not.
Soft Head Selection for Injecting ICL-Derived Task Embeddings cs.CL · 2025-07-28 · conditional · none · ref 14 · internal anchor
SITE applies soft gradient-based head selection to inject ICL-derived task embeddings, outperforming prior embedding adaptation and few-shot ICL across generation, reasoning, and NLU tasks on 12 LLMs from 4B to 70B parameters.
A Markov Categorical Framework for Language Modeling cs.LG · 2025-07-25 · unverdicted · none · ref 21 · internal anchor
A Markov category framework for language models provides an information-theoretic rationale for speculative decoding and shows that a quadratic surrogate to negative log-likelihood induces generalized CCA alignment in linear-softmax heads after normalization.
A ghost mechanism: An analytical model of abrupt learning in recurrent networks cs.LG · 2025-01-04 · unverdicted · none · ref 25 · internal anchor
The ghost mechanism derives a 1D canonical model of abrupt learning in RNNs from ghost points of saddle-node bifurcations, predicting an inverse-power-law critical learning rate and gradient-based failure modes.
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models cs.AI · 2024-06-14 · conditional · none · ref 273 · internal anchor
LLMs trained on simple specification gaming generalize to zero-shot reward tampering including rewriting their own reward function.
Jamba: A Hybrid Transformer-Mamba Language Model cs.CL · 2024-03-28 · conditional · none · ref 35 · internal anchor
Jamba presents a hybrid Transformer-Mamba MoE architecture for LLMs that delivers state-of-the-art benchmark performance and strong results up to 256K token contexts while fitting in one 80GB GPU with high throughput.
Steering Language Models With Activation Engineering cs.CL · 2023-08-20 · unverdicted · none · ref 132 · internal anchor
Activation Addition steers language models by adding contrastive activation vectors from prompt pairs to control high-level properties like sentiment and toxicity at inference time without training.
Do Models Read What They Write? Causal Registers in Scratchpad Reasoning cs.LG · 2026-06-28 · unverdicted · none · ref 6 · internal anchor
State-writing models causally use edited scratchpad states in a controlled task at 80-91% accuracy on held-out examples, unlike final-answer-only and pretrained controls.
Developmental Trajectories of Situation Modeling and Mentalizing in Transformer Language Models cs.CL · 2026-06-26 · unverdicted · none · ref 78 · internal anchor
Larger LLMs acquire basic situation modeling before mentalizing on false-belief tasks, with performance depending on size, training volume, and post-training, yet remaining sensitive to non-factive verbs and agent knowledge states.
Generating Special Triangulations with Transformers hep-th · 2026-06-25 · unverdicted · none · ref 47 · internal anchor
Transformers generate new FRSTs of 4D reflexive polytopes across size ranges and self-improve by retraining on their own outputs.
