SARATHI: Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills
SARATHI uses chunked prefills and decode-maximal batching to let decode steps ride along with prefill compute, delivering up to 10x higher decode throughput for LLaMA-13B and, with pipeline parallelism on GPT-3, a 1.91x end-to-end throughput improvement.
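A minimal scheduling sketch may make the mechanism concrete. This is not SARATHI's actual code; the names (`build_batch`, `CHUNK_TOKENS`, the `Request` fields) and the per-iteration token budget are illustrative assumptions. Each iteration first admits every in-flight decode (one token each), then fills the leftover budget of the fixed-size chunk with a slice of a pending prefill, so decodes piggyback on prefill compute:

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical per-iteration token budget; the paper tunes the chunk size
# per model and hardware.
CHUNK_TOKENS = 512

@dataclass
class Request:
    rid: int
    prompt_len: int
    prefilled: int = 0  # prompt tokens already processed

def build_batch(prefill_queue, decoding, budget=CHUNK_TOKENS):
    """Form one iteration's batch: all ready decodes first (decode-maximal),
    then fill the leftover token budget with a chunk of one pending prefill."""
    batch = [(req, 1) for req in decoding]  # each decode contributes 1 token
    remaining = budget - len(decoding)
    if prefill_queue and remaining > 0:
        head = prefill_queue[0]
        chunk = min(remaining, head.prompt_len - head.prefilled)
        batch.append((head, chunk))          # piggybacked prefill chunk
        head.prefilled += chunk
        if head.prefilled == head.prompt_len:
            # Prompt fully prefilled: the request moves to the decode set.
            decoding.append(prefill_queue.popleft())
    return batch

# Toy run: a 900-token prompt is prefilled in chunks across iterations
# while an already-admitted request keeps decoding one token per step.
queue, decoding = deque([Request(0, 900)]), [Request(1, 40, prefilled=40)]
for step in range(3):
    print(step, [(r.rid, n) for r, n in build_batch(queue, decoding)])
```

The intuition behind the speedup: decode-only iterations are memory-bound and leave GPU compute idle, so topping each chunk up with prefill tokens raises arithmetic intensity, which is why the added decodes are nearly free.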