2 Pith papers cite this work.

fields: cs.CL (1), cs.LG (1)

years: 2026 (1), 2022 (1)

representative citing papers

Fast Inference from Transformers via Speculative Decoding

cs.LG · 2022-11-30 · accept · novelty 7.0

Speculative decoding speeds up exact sampling from large autoregressive models by 2-3x on T5-XXL: a smaller approximation model proposes several tokens, and the large model verifies them in a single parallel pass, leaving the original output distribution unchanged.

citing papers explorer

  • Language Generation as Optimal Control: Closed-Loop Diffusion in Latent Control Space · cs.CL · 2026-05-14 · unverdicted · none · ref 6 · 2 links

    The paper introduces Manta-LM, which approximates the Hamilton-Jacobi-Bellman optimal policy via Flow Matching in a rectified latent control space to enable high-fidelity parallel language generation.

  • Fast Inference from Transformers via Speculative Decoding · cs.LG · 2022-11-30 · accept · none · ref 43
