Designing and interpreting probes with control tasks

10 John Hewitt, Percy Liang · 2019

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

H-Probes: Extracting Hierarchical Structures From Latent Representations of Language Models

cs.CL · 2026-04-15 · unverdicted · novelty 5.0

H-probes locate low-dimensional subspaces encoding hierarchy in LLM activations for synthetic tree tasks, show causal importance and generalization, and detect weaker signals in mathematical reasoning traces.

H-Probes: Extracting Hierarchical Structures From Latent Representations of Language Models · cs.CL · 2026-04-15 · unverdicted · none · ref 9
