On the Biology of a Large Language Model · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Reading Task Failure Off the Activations: A Sparse-Feature Audit of GPT-2 Small on Indirect Object Identification

cs.LG · 2026-05-21 · unverdicted · novelty 5.0

An empirical audit identifies a strong SAE feature correlate for GPT-2 small failures on 'keys' prompts in the IOI task, performs ablation and baseline controls showing it is not causal, and presents the audit pipeline as the primary contribution.

citing papers explorer

Showing 1 of 1 citing paper.

Reading Task Failure Off the Activations: A Sparse-Feature Audit of GPT-2 Small on Indirect Object Identification

cs.LG · 2026-05-21 · unverdicted · none · ref 1
