Adaptive computation time for transformers via early-exit mechanisms

J Xin, Y Song, L Cao, D Yu · 2023

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Beyond Exponential Decay: Rethinking Error Accumulation in Large Language Models

cs.CL · 2025-05-30 · unverdicted · novelty 5.0

LLM errors concentrate in sparse key tokens (5-10% of sequence) at semantic decision junctions, yielding a new reliability model that explains sustained long-context coherence.

citing papers explorer

Showing 1 of 1 citing paper.

Beyond Exponential Decay: Rethinking Error Accumulation in Large Language Models

cs.CL · 2025-05-30 · unverdicted · none · ref 15

