LayerSkip: Enabling early exit inference and self-speculative decoding, August 2024

Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

LogitSpec: Accelerating Retrieval-based Speculative Decoding via Next Next Token Speculation

cs.CL · 2025-07-02 · unverdicted · novelty 5.0 · ref 22

LogitSpec accelerates retrieval-based speculative decoding by speculating the next-next token from the last logit and retrieving relevant references for both next and next-next tokens, reporting up to 2.61x speedup and 3.28 mean accepted tokens.
