arXiv preprint arXiv:2403.02181

6 Pith papers cite this work. Polarity classification is still indexing.

citation summary

years: 2026 (5), 2025 (1)

verdicts: unverdicted (6)

roles: method (1)

polarities: use method (1)

citing papers explorer

Showing 6 of 6 citing papers after filters.

  • FASER: Fine-Grained Phase Management for Speculative Decoding in Dynamic LLM Serving · cs.DC · 2026-04-22 · unverdicted · none · ref 13

    FASER delivers up to 53% higher throughput and 1.92x lower latency in dynamic LLM serving by adjusting speculative lengths per request, pruning rejects early, and overlapping draft and verification phases via frontiers.

  • Two-dimensional early exit optimisation of LLM inference · cs.CL · 2026-03-27 · unverdicted · none · ref 7

    Coordinating layer-wise and sentence-wise early exits in LLMs produces multiplicative speedups of 1.4-2.3x over single-dimension early exit on sentiment classification tasks.

  • Tracing Computation Density in LLMs · cs.CL · 2026-05-26 · unverdicted · none · ref 1

    LLM computation follows a consistent two-phase pattern: a sparse early-layer core reconstructs the head of the output distribution, with later layers and attention heads providing incremental refinements that correlate with model uncertainty.

  • Uncovering the Latent Potential of Deep Intermediate Representations · cs.LG · 2026-05-21 · unverdicted · none · ref 43

    Introduces LOES, a constructive spectral method to select task-discriminative subspaces from intermediate layer embeddings, and GeoReg for enforcing simplicial class geometry during fine-tuning, with reported gains increasing with model depth across modalities.

  • The Generalization Ridge: Information Flow in Natural Language Generation · cs.CL · 2025-07-07 · unverdicted · none · ref 8

    InfoRidge reveals a non-monotonic pattern in which predictive mutual information between hidden states and outputs peaks in intermediate layers before declining in final layers.

  • ART: Attention Replacement Technique to Improve Factuality in LLMs · cs.CL · 2026-04-07 · unverdicted · none · ref 3

    ART replaces uniform attention in shallow LLM layers with local attention patterns to reduce hallucinations across multiple model architectures.