CADENZA in Action: Breaking the Monolith with Intent-Dependent Plan Spaces for Semantic Queries

Jaehyun Ha; Wook-Shin Han; Yongjoo Park

arxiv: 2607.01468 · v1 · pith:J6L7TGNInew · submitted 2026-07-01 · 💻 cs.DB

Pith reviewed 2026-07-03 01:07 UTC · model grok-4.3

classification 💻 cs.DB

keywords semantic query processing · query optimization · intent decomposition · plan selection · multimodal databases · quality latency cost tradeoffs · semantic operators

The pith

CADENZA breaks monolithic semantic operators by compiling intents into decomposed steps and selecting tuned physical implementations under quality-latency-cost preferences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CADENZA as a semantic operator optimizer that takes a natural-language intent and compiles it into multiple decomposed steps rather than treating the entire operator as one large model. For each step the system chooses concrete physical implementations and adjusts their parameters to fit user goals on quality, speed, and cost. This addresses the limitation of existing engines that must pick between expensive full models and cheaper options that lose semantic accuracy. The demonstration lets users explore the generated plans through a web interface over multimodal databases and see how preference changes select different winners. If the approach holds, query processing can move from fixed monolithic choices to flexible, intent-specific plan spaces.

Core claim

CADENZA compiles an intent into decomposed steps, selects concrete physical implementations for each step, and tunes their parameters under user-specified quality-latency-cost preferences.

What carries the argument

Intent-dependent plan spaces that represent alternative decompositions of a semantic intent along with selectable physical implementations for each step.
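A minimal sketch of how such a plan space might be represented, offered only as an illustration. The class names (`Impl`, `Step`), the two decompositions, the implementation names, all numbers, and the aggregation rule (multiply per-step quality, add latency and cost) are assumptions made here; the paper does not specify any of them.

```python
from dataclasses import dataclass
from itertools import product
from math import prod


@dataclass(frozen=True)
class Impl:
    """One hypothetical physical implementation with estimated metrics."""
    name: str
    quality: float    # estimated accuracy in [0, 1] (illustrative value)
    latency_s: float  # estimated seconds per query (illustrative value)
    cost_usd: float   # estimated dollars per query (illustrative value)


@dataclass(frozen=True)
class Step:
    """One decomposed step and the implementations that could run it."""
    name: str
    impls: tuple


# Hypothetical plan space for a single intent: decomposition name -> steps.
PLAN_SPACE = {
    "monolithic": (
        Step("match", (Impl("large_llm", 0.95, 40.0, 2.00),)),
    ),
    "filter_then_match": (
        Step("brand_filter", (Impl("embedding", 0.90, 2.0, 0.01),
                              Impl("small_llm", 0.97, 8.0, 0.20))),
        Step("match", (Impl("large_llm", 0.95, 10.0, 0.50),
                       Impl("small_llm", 0.88, 4.0, 0.10))),
    ),
}


def enumerate_plans(plan_space):
    """Yield (decomposition, impl names, quality, latency, cost) per concrete plan."""
    for decomp, steps in plan_space.items():
        # One concrete plan = one implementation chosen for every step.
        for combo in product(*(step.impls for step in steps)):
            yield (
                decomp,
                tuple(impl.name for impl in combo),
                prod(impl.quality for impl in combo),   # assumed: step errors compound
                sum(impl.latency_s for impl in combo),  # assumed: steps run one after another
                sum(impl.cost_usd for impl in combo),
            )


plans = list(enumerate_plans(PLAN_SPACE))
for plan in plans:
    print(plan)
```

Under these assumptions the monolithic decomposition contributes one concrete plan and the two-step decomposition contributes four, so a single intent already yields five candidates to compare.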

If this is right

Users can supply explicit quality-latency-cost preferences and receive a different winning plan for each setting.
Optimization occurs at the granularity of individual steps instead of entire models, allowing cheaper implementations where semantics permit.
The web interface exposes how an intent is broken down and how each plan is scored and chosen.
Alternative plans are generated from the same intent so the system can compare them directly under the given preferences.
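The way a preference setting could change the winning plan can be sketched with a weighted score over normalized estimates. This is an assumed scoring rule for illustration, not the method described in the paper; the plan names, the metric values, and the function names `normalize` and `pick_plan` are all hypothetical.

```python
# Hypothetical candidate plans: name -> (quality, latency in s, cost in USD).
CANDIDATES = {
    "llm_only":       (0.95, 40.0, 2.00),
    "embed_then_llm": (0.90, 12.0, 0.50),
    "embed_only":     (0.75, 2.0, 0.02),
}


def normalize(values, higher_is_better):
    """Min-max scale to [0, 1] so that 1.0 is always the best value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def pick_plan(candidates, w_quality, w_latency, w_cost):
    """Return the plan name with the highest preference-weighted score."""
    names = list(candidates)
    quality = normalize([candidates[n][0] for n in names], higher_is_better=True)
    latency = normalize([candidates[n][1] for n in names], higher_is_better=False)
    cost = normalize([candidates[n][2] for n in names], higher_is_better=False)
    scores = {
        name: w_quality * quality[i] + w_latency * latency[i] + w_cost * cost[i]
        for i, name in enumerate(names)
    }
    return max(scores, key=scores.get)


# Three illustrative preference settings (weights are assumptions).
for label, weights in {
    "quality-first": (0.8, 0.1, 0.1),
    "balanced": (1 / 3, 1 / 3, 1 / 3),
    "cost-first": (0.1, 0.1, 0.8),
}.items():
    print(label, "->", pick_plan(CANDIDATES, *weights))
```

With these made-up numbers the three settings select three different plans, which is the behavior the list above describes. A linear weighted sum is only one possible choice; a real optimizer might instead use constraints or a Pareto frontier.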

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same decomposition approach could be tested on single-modality data to check whether the plan-space benefit generalizes beyond combined image and text workloads.
If decomposition overhead stays low, the method might be combined with traditional relational optimizers for mixed semantic and structured queries.
Repeated use on similar intents could allow caching of high-performing plan templates to reduce compilation time on later runs.

Load-bearing premise

Decomposing an intent into steps must produce plans whose quality, latency, and cost can be estimated and compared without the decomposition itself creating semantic errors or excessive overhead.

What would settle it

Run the same intent through CADENZA and a monolithic baseline on a fixed multimodal dataset and measure whether any selected plan meets the stated quality target while using less latency or cost than the monolithic version.
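The acceptance test described above can be written down as a small check. This is a sketch under stated assumptions: the `Measurement` record, the function name `plans_beating_baseline`, and every number are placeholders invented here, not results from the paper, which reports no measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Measured metrics for one plan on a fixed dataset (placeholder values)."""
    name: str
    quality: float
    latency_s: float
    cost_usd: float


def plans_beating_baseline(baseline, plans, quality_target):
    """Names of plans that meet the quality target with less latency or less cost."""
    return [
        p.name
        for p in plans
        if p.quality >= quality_target
        and (p.latency_s < baseline.latency_s or p.cost_usd < baseline.cost_usd)
    ]


# Placeholder measurements, invented purely to exercise the check.
baseline = Measurement("monolithic", 0.95, 40.0, 2.00)
candidates = [
    Measurement("p1", 0.93, 12.0, 0.50),  # meets target and is cheaper and faster
    Measurement("p2", 0.80, 2.0, 0.02),   # cheap but below the quality target
    Measurement("p3", 0.96, 45.0, 2.50),  # meets target but is slower and costlier
]

winners = plans_beating_baseline(baseline, candidates, quality_target=0.90)
print(winners)
```

An empty result for every reasonable quality target would count against the claim; a non-empty result on real measurements would support it.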

Figures

Figures reproduced from arXiv: 2607.01468 by Jaehyun Ha, Wook-Shin Han, Yongjoo Park.

**Figure 1.** Three candidate logical plans for SemJoin with “match product descriptions with product photos of the same brand.” Each node is an operator and each edge denotes the attribute consumed by the next operator.

**Figure 2.** System architecture of CADENZA.

**Figure 3.** Demo walkthrough. S1 (a–d): Under Balanced preference, a SemJoin query is decomposed into three plans; Plan C …

Original abstract

Semantic query processing engines execute semantic operators, whose behavior is specified by natural-language intents, via model inference over multimodal data. Most existing optimizers optimize the operators at the granularity of monolithic implementations -- such as LLMs and embedding models -- forcing a trade-off between expensive model calls and cheaper alternatives that fail to capture intent-dependent semantics. We present CADENZA, a semantic operator optimizer that compiles an intent into decomposed steps, selects concrete physical implementations for each step, and tunes their parameters under user-specified quality-latency-cost preferences. In this demonstration, users interact with CADENZA through a web interface over multimodal databases, exploring how an intent is decomposed into alternative plans, how each plan is optimized, and how different preferences yield different winning plans.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CADENZA is a clean system demo for decomposing semantic intents into tunable plans, but it supplies zero measurements or comparisons so the claims stay untested.

CADENZA describes a semantic query optimizer that takes a natural-language intent, breaks it into steps, chooses physical implementations for each step, and picks the combination that matches user-set quality, latency, and cost targets. The paper's core contribution is the explicit use of intent-dependent plan spaces rather than treating each semantic operator as a single monolithic model call.

The demo interface is the part that works: users can browse alternative decompositions, see how parameters are tuned, and watch how different preferences produce different winning plans. That matches the stated goal of letting people explore the trade-offs directly.

The obvious limitation is the complete absence of any numbers. No latency figures, no quality scores, no error rates, no comparison against the monolithic baselines mentioned in the abstract. Without those, it is impossible to judge whether the decomposition step itself introduces semantic drift or whether the cost models are accurate enough to be useful. The paper is a description of implemented behavior, not an evaluation of it.

This is the kind of work that belongs in the AI-plus-databases corner of the field. Someone already building semantic query engines might pick up the plan-space framing or the interface design as a starting point. A reader looking for measured improvements or formal guarantees will find nothing to cite or build on.

I would send it to peer review. It is a straightforward systems demonstration that could become a useful reference once the authors add even modest empirical results.

Referee Report

1 major / 0 minor

Summary. The paper claims to present CADENZA, a semantic operator optimizer for semantic query processing engines that execute semantic operators specified by natural-language intents. CADENZA compiles an intent into decomposed steps, selects concrete physical implementations for each step, and tunes their parameters under user-specified quality-latency-cost preferences. The demonstration allows interaction via a web interface to explore how intents are decomposed into alternative plans, how plans are optimized, and how preferences affect winning plans over multimodal databases.

Significance. Should the system function as described, CADENZA would offer a meaningful contribution to the field by moving beyond monolithic optimizations for semantic operators, potentially allowing better trade-offs between quality, latency, and cost in multimodal database queries. The working demonstration through a web interface provides a concrete way to illustrate these concepts, which is a strength of the manuscript.

major comments (1)

Abstract: the description of the compilation and selection process assumes that intent decomposition produces alternative plans whose quality, latency, and cost can be meaningfully estimated and compared without the decomposition itself introducing semantic errors or prohibitive overhead, but the manuscript supplies no discussion, algorithm details, or evidence addressing this assumption, which is load-bearing for the central claim of effective intent-dependent plan spaces.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and the positive assessment of CADENZA's potential contribution. We address the single major comment below.

Referee: Abstract: the description of the compilation and selection process assumes that intent decomposition produces alternative plans whose quality, latency, and cost can be meaningfully estimated and compared without the decomposition itself introducing semantic errors or prohibitive overhead, but the manuscript supplies no discussion, algorithm details, or evidence addressing this assumption, which is load-bearing for the central claim of effective intent-dependent plan spaces.

Authors: We agree that the manuscript, which is structured as a demonstration paper, does not supply the requested discussion, algorithm details, or evidence on semantic fidelity or overhead during intent decomposition. This is a substantive gap for the central claim. In the revised manuscript we will add a concise subsection (approximately one page) under System Design that outlines the decomposition procedure, the validation steps used to detect semantic drift, and the pruning heuristics that bound overhead; we will illustrate these with concrete examples drawn from the web-interface scenarios already shown in the demonstration. revision: yes

Circularity Check

0 steps flagged

No derivation chain present; system description only

The paper is a system demonstration of CADENZA. It describes intent compilation into decomposed steps, physical operator selection, and preference-based tuning, but contains no equations, formal derivations, fitted parameters, predictions, or mathematical claims. The reader's assessment correctly notes the absence of any quantities that could reduce to self-referential definitions or self-citations. No load-bearing steps exist that could be evaluated for circularity under the specified patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract introduces CADENZA as a new optimizer but contains no mathematical model, fitted constants, or postulated entities; the ledger is empty because the contribution is a system description rather than a derivation.

pith-pipeline@v0.9.1-grok · 5656 in / 1017 out tokens · 25418 ms · 2026-07-03T01:07:23.248067+00:00 · methodology
