Data-driven Circuit Discovery for Interpretability of Language Models

Standard circuit discovery methods produce dataset-specific circuits rather than task-general ones; a new clustering-based method discovers multiple, more faithful circuits per dataset.
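The summary above describes the method only at a high level, so the following is a minimal sketch of what a clustering-based circuit-discovery step could look like, not the paper's actual algorithm. It assumes per-example attribution scores over a model's edges are already available (e.g. from activation patching), clusters the examples by those scores, and thresholds each cluster's mean attributions into a separate circuit. The function name and the keep_frac parameter are illustrative.

```python
# Hypothetical sketch of clustering-based circuit discovery (not the paper's algorithm).
import numpy as np
from sklearn.cluster import KMeans

def circuits_from_clusters(attributions: np.ndarray, n_clusters: int = 3, keep_frac: float = 0.05):
    """attributions: [n_examples, n_edges] per-example attribution scores for the edges
    of a model's computational graph (how these are computed is assumed, e.g. patching).
    Returns one boolean edge mask ("circuit") per cluster of examples."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(attributions)
    circuits = []
    for c in range(n_clusters):
        mean_attr = np.abs(attributions[labels == c]).mean(axis=0)
        k = max(1, int(keep_frac * mean_attr.size))
        threshold = np.sort(mean_attr)[-k]   # keep the top keep_frac of edges by mean attribution
        circuits.append(mean_attr >= threshold)
    return circuits
```

Clustering examples before thresholding is what would allow one dataset to yield several distinct circuits instead of a single averaged one.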
Representative citing papers
- How Language Models Process Negation
  LLMs implement both attention-based suppression and constructive representations for negation, with the constructive mechanism dominant, despite poor accuracy caused by late-layer attention shortcuts.
- Inside-Out: Measuring Generalization in Vision Transformers Through Inner Workings
  Circuit-based metrics computed from Vision Transformer internals provide better label-free proxies for generalization under distribution shift than existing methods such as model confidence (a sketch of the confidence baseline follows the list).
- Tracing Relational Knowledge Recall in Large Language Models
  Per-head attention contributions to the residual stream serve as strong linear features for classifying relational knowledge in LLMs, with probe accuracy correlating with relation specificity and signal distribution (see the probe sketch after the list).
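On the Inside-Out entry: the paper's circuit-based metrics are not described here, but the confidence baseline it is compared against is a standard label-free proxy. The sketch below shows only that baseline, average maximum softmax probability over an unlabeled shifted dataset; the model and shifted_loader are placeholders.

```python
# Baseline only: average max-softmax confidence as a label-free generalization proxy.
# The circuit-based metrics from the paper are not reproduced here.
import torch

@torch.no_grad()
def average_confidence(model: torch.nn.Module, shifted_loader) -> float:
    """Mean max-softmax probability over an unlabeled (possibly distribution-shifted) loader."""
    model.eval()
    confidences = []
    for images, *_ in shifted_loader:          # labels, if present, are ignored
        probs = torch.softmax(model(images), dim=-1)
        confidences.append(probs.max(dim=-1).values)
    return torch.cat(confidences).mean().item()
```

Higher average confidence is typically read as a prediction of higher accuracy on the shifted data; the paper's claim is that metrics derived from the model's internal circuits track true accuracy better than this.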
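On the Tracing Relational Knowledge entry: a minimal sketch of one plausible probing setup, assuming TransformerLens for activation access. The model choice, token position, layer, prompts, and relation labels below are illustrative, not the paper's protocol. Each head's contribution to the residual stream is computed as z @ W_O and fed to a linear classifier.

```python
# Illustrative probe on per-head attention contributions to the residual stream
# (assumes TransformerLens; the probing protocol here is a guess, not the paper's).
import torch
from transformer_lens import HookedTransformer
from sklearn.linear_model import LogisticRegression

model = HookedTransformer.from_pretrained("gpt2")  # placeholder model choice

@torch.no_grad()
def last_token_head_contributions(prompt: str, layer: int) -> torch.Tensor:
    """Per-head contribution to the residual stream at the final token of `prompt`."""
    _, cache = model.run_with_cache(model.to_tokens(prompt))
    z = cache["z", layer][0, -1]                             # [n_heads, d_head]
    return torch.einsum("hd,hdm->hm", z, model.W_O[layer])   # [n_heads, d_model]

# Features: flattened head contributions; labels: which relation each prompt expresses.
prompts = ["Paris is the capital of", "The Eiffel Tower is located in"]  # toy examples
labels = [0, 1]                                              # hypothetical relation ids
X = torch.stack([last_token_head_contributions(p, layer=8) for p in prompts]).flatten(1)
probe = LogisticRegression(max_iter=1000).fit(X.numpy(), labels)
```

The reported correlation between probe accuracy and relation specificity would then come from fitting such probes per relation type and comparing their accuracies.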