Zoom in: An introduction to circuits

Chris Olah, Nick Cammarata, Ludwig Schubert, Gabriel Goh, Michael Petrov, Shan Carter · 2020

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Localizing Model Behavior with Path Patching

cs.LG · 2023-04-12 · unverdicted · novelty 8.0

Path patching provides a method to express and quantitatively test hypotheses that neural network behaviors are localized to sets of paths.

Sparse Autoencoders Find Highly Interpretable Features in Language Models

cs.LG · 2023-09-15 · unverdicted · novelty 6.0

Sparse autoencoders applied to language model activations yield more interpretable and monosemantic features than alternative approaches, enabling finer causal analysis on the indirect object identification task.

citing papers explorer

Localizing Model Behavior with Path Patching · cs.LG · 2023-04-12 · unverdicted · none · ref 46
Sparse Autoencoders Find Highly Interpretable Features in Language Models · cs.LG · 2023-09-15 · unverdicted · none · ref 21
