arXiv preprint arXiv:2403.02181 , year=

Towards automated circuit discovery for mechanistic interpretability · 2026 · arXiv 2403.02181

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

FASER: Fine-Grained Phase Management for Speculative Decoding in Dynamic LLM Serving

cs.DC · 2026-04-22 · unverdicted · novelty 7.0

FASER delivers up to 53% higher throughput and 1.92x lower latency in dynamic LLM serving by adjusting speculative lengths per request, early pruning of rejects, and overlapping draft/verification phases via frontiers.

Two-dimensional early exit optimisation of LLM inference

cs.CL · 2026-03-27 · unverdicted · novelty 7.0

Coordinating layer-wise and sentence-wise early exits in LLMs produces multiplicative speedups of 1.4-2.3x over single-dimension early exit on sentiment classification tasks.

Tracing Computation Density in LLMs

cs.CL · 2026-05-26 · unverdicted · novelty 6.0

LLM computation follows a consistent two-phase pattern: a sparse early-layer core reconstructs the head of the output distribution, with later layers and attention heads providing incremental refinements that correlate with model uncertainty.

Uncovering the Latent Potential of Deep Intermediate Representations

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

Introduces LOES, a constructive spectral method to select task-discriminative subspaces from intermediate layer embeddings, and GeoReg for enforcing simplicial class geometry during fine-tuning, with reported gains increasing with model depth across modalities.

The Generalization Ridge: Information Flow in Natural Language Generation

cs.CL · 2025-07-07 · unverdicted · novelty 6.0

InfoRidge reveals a non-monotonic pattern in which predictive mutual information between hidden states and outputs peaks in intermediate layers before declining in final layers.

ART: Attention Replacement Technique to Improve Factuality in LLMs

cs.CL · 2026-04-07 · unverdicted · novelty 5.0

ART replaces uniform attention in shallow LLM layers with local attention patterns to reduce hallucinations across multiple model architectures.

citing papers explorer

FASER: Fine-Grained Phase Management for Speculative Decoding in Dynamic LLM Serving cs.DC · 2026-04-22 · unverdicted · none · ref 13
Two-dimensional early exit optimisation of LLM inference cs.CL · 2026-03-27 · unverdicted · none · ref 7
Tracing Computation Density in LLMs cs.CL · 2026-05-26 · unverdicted · none · ref 1
Uncovering the Latent Potential of Deep Intermediate Representations cs.LG · 2026-05-21 · unverdicted · none · ref 43
The Generalization Ridge: Information Flow in Natural Language Generation cs.CL · 2025-07-07 · unverdicted · none · ref 8
ART: Attention Replacement Technique to Improve Factuality in LLMs cs.CL · 2026-04-07 · unverdicted · none · ref 3
