Presto is extended to GPU-aware execution using cuDF. Experiments on TPC-H show up to 6x cost/performance gains over CPU Presto via optimized data paths and inter-operator communication.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.DB (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Ring-buffer shuffle achieves amortized O(1) synchronization per batch and O(M) memory use, outperforming prior methods by up to 300% on 192-core systems in query engine benchmarks.
citing papers explorer
- Accelerating Presto with GPUs
Presto is extended to GPU-aware execution using cuDF. Experiments on TPC-H show up to 6x cost/performance gains over CPU Presto via optimized data paths and inter-operator communication.
- One Ring to Shuffle Them All: Scalable Intra-Process Data Redistribution with Ring-Buffer Shuffle in Redpanda Oxla
Ring-buffer shuffle achieves amortized O(1) synchronization per batch and O(M) memory use, outperforming prior methods by up to 300% on 192-core systems in query engine benchmarks.