Interpreting neural networks through the polytope lens · arXiv:2211.12312

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

LLM tasks are supported by multiple distinct circuits rather than unique mechanisms, demonstrated via Overlap-Aware Sheaf Repulsion and the Distributive Dense Circuit Hypothesis.

Surrogate Fidelity: When Can Open LLMs Explain Closed Ones?

cs.LG · 2026-06-30 · unverdicted · novelty 6.0

Prediction agreement between open and closed LLMs substantially overstates agreement on attributions and causal reasons.

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

citing papers explorer

Showing 1 of 1 citing paper after filters.

All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs

cs.CL · 2026-05-12 · unverdicted · none · ref 86

LLM tasks are supported by multiple distinct circuits rather than unique mechanisms, demonstrated via Overlap-Aware Sheaf Repulsion and the Distributive Dense Circuit Hypothesis.
