PLOP: Cost-Based Placement of Semantic Operators in Hybrid Query Plans

· 2026 · cs.DB · arXiv 2604.09944

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Recent database systems have introduced semantic operators that leverage large language models (LLMs) to filter, join, and project over structured data using natural language predicates. In practice, these operators are combined with traditional relational operators, e.g., equi-joins, producing hybrid query plans whose execution cost depends on both expensive LLM calls and conventional database processing. A key optimization question is where to place each semantic operator relative to the relational operators in the plan: placing them earlier reduces the data that subsequent operators process, but requires more LLM calls; placing them later reduces LLM calls through deduplication, but forces relational operators to process larger intermediate data. Existing systems either ignore this placement question or apply simple heuristics without considering the full cost trade-off. We present PLOP, a plan-level optimizer for hybrid semantic-relational queries. PLOP reduces hybrid query planning to semantic filter placement via two equivalence-preserving rewrites. We prove that deferring all semantic filters to the latest possible position minimizes LLM invocations under function caching, but show that this can cause relational processing costs to dominate on complex multi-table queries. To balance LLM cost against relational cost, PLOP uses a dynamic-programming-based cost model that finds the placement minimizing their weighted sum. On 44 semantic SQL queries across five schemas and two benchmarks, PLOP achieves up to 1.5$\times$ speedup and 4.29$\times$ cost reduction while maintaining high output quality: an average F1 of 0.85 against the unoptimized baseline and 0.84 against human-annotated ground truth on SemBench. Overall, PLOP achieves a significant cost reduction while preserving the highest accuracy among six publicly available systems.
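The placement trade-off described in the abstract can be illustrated with a toy example. The sketch below is not PLOP's optimizer or its cost model: it only compares two fixed placements of one semantic filter around one equi-join, counting LLM calls under a simple per-input cache and counting rows touched by the join. The keyword predicate stands in for an LLM call, and the function names, the weights `w_llm` and `w_rel`, and the sample data are all assumptions made for illustration.

```python
def make_cached_filter(predicate):
    """Wrap a predicate so each distinct input is evaluated once.

    Returns the cached function and a counter of real (uncached) calls,
    which stands in for the number of LLM invocations.
    """
    calls = {"n": 0}
    cache = {}

    def cached(text):
        if text not in cache:
            calls["n"] += 1
            cache[text] = predicate(text)
        return cache[text]

    return cached, calls


def hash_join(left, right_keys):
    """Equi-join on key: emit each left row once per matching right key."""
    counts = {}
    for k in right_keys:
        counts[k] = counts.get(k, 0) + 1
    out = []
    for key, text in left:
        out.extend([(key, text)] * counts.get(key, 0))
    return out


def plan_cost(left, right_keys, predicate, filter_first, w_llm=1.0, w_rel=0.01):
    """Run one placement and report LLM calls, join rows, and a weighted sum."""
    sem_filter, calls = make_cached_filter(predicate)
    if filter_first:
        # Early placement: filter the base table, then join the survivors.
        filtered = [row for row in left if sem_filter(row[1])]
        result = hash_join(filtered, right_keys)
        rel_rows = len(filtered) + len(right_keys) + len(result)
    else:
        # Late placement: join first, then filter the join output.
        joined = hash_join(left, right_keys)
        result = [row for row in joined if sem_filter(row[1])]
        rel_rows = len(left) + len(right_keys) + len(joined)
    return {
        "llm_calls": calls["n"],
        "rel_rows": rel_rows,
        "cost": w_llm * calls["n"] + w_rel * rel_rows,
        "result": sorted(result),
    }


# Hypothetical data: (key, review text) rows joined against a list of keys.
left = [(1, "great phone"), (2, "bad battery"), (3, "great phone"),
        (4, "love it"), (5, "terrible")]
right_keys = [1, 1, 3]


def predicate(text):
    # Keyword stand-in for an LLM judgment such as "is this review positive?"
    return "great" in text or "love" in text


early = plan_cost(left, right_keys, predicate, filter_first=True)
late = plan_cost(left, right_keys, predicate, filter_first=False)
print("early:", early["llm_calls"], "LLM calls,", early["rel_rows"], "join rows")
print("late: ", late["llm_calls"], "LLM calls,", late["rel_rows"], "join rows")
```

On this data both placements return the same rows. The late placement issues fewer cached LLM calls because only texts that survive the join are evaluated, while the early placement feeds fewer rows into the join. Which plan has the lower weighted sum depends entirely on the assumed weights.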

representative citing papers

Large Databases Need Small, Open-Weight Language Models

cs.AI · 2026-06-30 · unverdicted · novelty 4.0

Quantized open-weight LMs on consumer hardware match closed-source API accuracy for LM-enhanced relational operators while delivering 390x lower cost and 3.8x lower latency in the BlendSQL framework.

