Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB
Pith reviewed 2026-05-22 22:29 UTC · model grok-4.3
The pith
FlockMTL embeds LLM calls and RAG directly into DuckDB through model-driven SQL functions and new PROMPT and MODEL schema objects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FlockMTL extends a DBMS with model-driven scalar and aggregate functions for chained LLM predictions on tuples, together with PROMPT and MODEL as first-class schema objects that sit alongside TABLE; these abstractions permit cost-based optimizations including batching and caching while providing resource independence, allowing SQL queries to handle both structured data and retrieval-augmented generation without external orchestration.
What carries the argument
Model-driven scalar and aggregate functions plus PROMPT and MODEL DDL objects treated as first-class schema elements, which carry LLM integration into query execution and optimization.
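As a rough illustration of what "model-driven scalar and aggregate functions" could mean in practice, the sketch below registers one of each with Python's standard-library sqlite3 module. It is not FlockMTL's API and does not use DuckDB. The names `stub_llm`, `llm_map`, and `llm_reduce` are hypothetical, and the "model" is a deterministic toy rule standing in for a real LLM call.

```python
import sqlite3


def stub_llm(prompt: str, text: str) -> str:
    """Hypothetical stand-in for a model call (assumption, not a real LLM)."""
    # Toy rule so the example is deterministic: "slow" means negative.
    return "negative" if "slow" in text.lower() else "positive"


def llm_map(prompt: str, text: str) -> str:
    """Scalar function: one prediction per input tuple (a map)."""
    return stub_llm(prompt, text)


class LlmReduce:
    """Aggregate function: many tuples in, one prediction out (a reduce)."""

    def __init__(self):
        self.prompt = None
        self.rows = []

    def step(self, prompt, text):
        # Called once per tuple in the group; just collect the inputs.
        self.prompt = prompt
        self.rows.append(text)

    def finalize(self):
        # A real reducer would hand the collected rows to the model;
        # here the stub labels each row and the labels are counted.
        labels = [stub_llm(self.prompt, row) for row in self.rows]
        return f"{labels.count('negative')} of {len(labels)} reviews negative"


conn = sqlite3.connect(":memory:")
conn.create_function("llm_map", 2, llm_map)
conn.create_aggregate("llm_reduce", 2, LlmReduce)

conn.execute("CREATE TABLE reviews (id INTEGER, product TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [
        (1, "duck", "Fast and reliable"),
        (2, "duck", "Too slow on joins"),
        (3, "goose", "Slow startup"),
    ],
)

# Tuple-level mapping: one label per row.
mapped = conn.execute(
    "SELECT id, llm_map('classify sentiment', body) FROM reviews ORDER BY id"
).fetchall()

# Reduction: one result per group.
reduced = conn.execute(
    "SELECT product, llm_reduce('summarize sentiment', body) "
    "FROM reviews GROUP BY product ORDER BY product"
).fetchall()

print(mapped)
print(reduced)
```

The point of the sketch is only that both shapes fit ordinary SQL: the scalar call sits in the SELECT list, and the aggregate call composes with GROUP BY like any built-in aggregate.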
If this is right
- SQL queries can express tuple-level LLM mappings and reductions without leaving the DBMS.
- Batching and caching of LLM calls occur automatically through existing cost-based mechanisms.
- Data movement between database and language-model services is eliminated for these workloads.
- Development of applications that mix tabular retrieval with document-based reasoning becomes a single SQL task.
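The batching and caching mentioned in the list above can be pictured with a small sketch. This is an assumption-laden illustration of the general technique, not the paper's optimizer: `batched_cached_predict`, `fake_model`, and the fixed `batch_size` are all hypothetical. A cost-based system would presumably choose the batch size itself, for example from context-window limits.

```python
from typing import Callable, Dict, List


def batched_cached_predict(
    inputs: List[str],
    call_model: Callable[[List[str]], List[str]],
    cache: Dict[str, str],
    batch_size: int,
) -> List[str]:
    """Answer every input while calling the model only for unseen values."""
    # Deduplicate the cache misses, keeping first-seen order.
    misses = list(dict.fromkeys(x for x in inputs if x not in cache))
    # Send the misses in fixed-size batches, one model call per batch.
    for start in range(0, len(misses), batch_size):
        batch = misses[start:start + batch_size]
        outputs = call_model(batch)
        cache.update(zip(batch, outputs))
    # Every input is now answerable from the cache.
    return [cache[x] for x in inputs]


calls: List[List[str]] = []  # records each batch sent to the fake model


def fake_model(batch: List[str]) -> List[str]:
    """Hypothetical model endpoint: upper-cases each input."""
    calls.append(list(batch))
    return [text.upper() for text in batch]


cache: Dict[str, str] = {}
first = batched_cached_predict(["a", "b", "a", "c"], fake_model, cache, batch_size=2)
second = batched_cached_predict(["c", "d"], fake_model, cache, batch_size=2)

print(first)   # ['A', 'B', 'A', 'C']
print(second)  # ['C', 'D']
print(calls)   # [['a', 'b'], ['c'], ['d']]
```

Six inputs across two queries cost three model calls here, because the duplicate "a" and the repeated "c" are served from the cache.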
Where Pith is reading between the lines
- The same DDL abstractions could be adopted by other relational engines to achieve similar integration.
- Workload-specific cost models for LLM latency and token usage might emerge as natural extensions of the optimizer.
- Applications that already run inside DuckDB could gain new capabilities for unstructured data without pipeline changes.
Load-bearing premise
The assumption that embedding LLM calls as model-driven functions plus new PROMPT and MODEL DDL objects will meaningfully reduce orchestration effort and data movement compared with existing heterogeneous pipelines.
What would settle it
A side-by-side comparison of lines of code, development time, and end-to-end latency for the same knowledge-intensive analytical task written once with FlockMTL and once with separate database and LLM systems.
Original abstract
Knowledge-intensive analytical applications retrieve context from both structured tabular data and unstructured, text-free documents for effective decision-making. Large language models (LLMs) have made it significantly easier to prototype such retrieval and reasoning data pipelines. However, implementing these pipelines efficiently still demands significant effort and has several challenges. This often involves orchestrating heterogeneous data systems, managing data movement, and handling low-level implementation details, e.g., LLM context management. To address these challenges, we introduce FlockMTL: an extension for DBMSs that deeply integrates LLM capabilities and retrieval-augmented generation (RAG). FlockMTL includes model-driven scalar and aggregate functions, enabling chained predictions through tuple-level mappings and reductions. Drawing inspiration from the relational model, FlockMTL incorporates: (i) cost-based optimizations, which seamlessly apply techniques such as batching and caching; and (ii) resource independence, enabled through novel SQL DDL abstractions: PROMPT and MODEL, introduced as first-class schema objects alongside TABLE. FlockMTL streamlines the development of knowledge-intensive analytical applications, and its optimizations ease the implementation burden.
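One way to read "resource independence" is that queries refer to models and prompts by name, so the underlying resource can change without the query changing. The sketch below assumes that reading. The `Catalog` class and its methods are hypothetical and are not FlockMTL's DDL syntax, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Catalog:
    """Hypothetical schema catalog holding named MODEL and PROMPT objects."""

    models: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    prompts: Dict[str, str] = field(default_factory=dict)

    def create_model(self, name: str, backend: Callable[[str], str]) -> None:
        # Creating or replacing a model only rebinds the name.
        self.models[name] = backend

    def create_prompt(self, name: str, template: str) -> None:
        self.prompts[name] = template

    def complete(self, model_name: str, prompt_name: str, **fields: str) -> str:
        # Resolve both names at call time, then invoke the bound backend.
        text = self.prompts[prompt_name].format(**fields)
        return self.models[model_name](text)


def query(catalog: Catalog) -> str:
    """Stands in for a SQL query that names a model and a prompt."""
    return catalog.complete("summarizer", "tldr", body="ducks quack")


catalog = Catalog()
catalog.create_model("summarizer", lambda text: f"[v1] {text}")
catalog.create_prompt("tldr", "Summarize: {body}")
before = query(catalog)

# Swap the backend and the prompt text; the query itself is untouched.
catalog.create_model("summarizer", lambda text: f"[v2] {text}")
catalog.create_prompt("tldr", "TL;DR: {body}")
after = query(catalog)

print(before)  # [v1] Summarize: ducks quack
print(after)   # [v2] TL;DR: ducks quack
```

The indirection is the same one a table name gives: the query text names the object, and an administrator can change what the name is bound to.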
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces FlockMTL as a DuckDB extension for deep integration of LLMs and RAG into a DBMS. It proposes model-driven scalar and aggregate functions to enable tuple-level mappings and reductions for chained LLM predictions, along with first-class PROMPT and MODEL DDL objects (modeled after TABLE) to support resource independence. The system incorporates cost-based optimizations such as batching and caching, with the central claim that this approach reduces orchestration effort, data movement, and low-level implementation details relative to heterogeneous pipelines for knowledge-intensive analytical applications.
Significance. If substantiated, the approach could lower barriers for building applications that combine structured tabular data with unstructured documents via LLMs inside a single DBMS. The conceptual framing around relational-model-inspired DDL abstractions and optimizer integration is a plausible direction for reducing pipeline complexity. However, the manuscript supplies no implementation details, benchmarks, user studies, or controlled comparisons, so the claimed reductions in burden cannot be evaluated and the significance remains speculative.
major comments (2)
- [Abstract] Abstract: the claim that 'FlockMTL streamlines the development of knowledge-intensive analytical applications, and its optimizations ease the implementation burden' is presented as the outcome of the work, yet the manuscript contains no evaluation results, metrics on orchestration effort, data-movement measurements, or comparisons against heterogeneous pipelines to support this assertion.
- [Abstract] Abstract: the description of 'model-driven scalar and aggregate functions' and 'novel SQL DDL abstractions: PROMPT and MODEL' is given at a high level only, with no specification of their semantics, how they interact with the query optimizer, or how cost-based decisions for batching/caching are realized; this absence prevents assessment of whether the proposed deep integration is technically feasible or load-bearing for the central claim.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting issues with the abstract's claims and level of technical detail. We agree that both points require attention and will revise the manuscript accordingly to better align the presentation with the content provided.
- Referee: [Abstract] Abstract: the claim that 'FlockMTL streamlines the development of knowledge-intensive analytical applications, and its optimizations ease the implementation burden' is presented as the outcome of the work, yet the manuscript contains no evaluation results, metrics on orchestration effort, data-movement measurements, or comparisons against heterogeneous pipelines to support this assertion.
  Authors: We agree that the abstract overstates these benefits as demonstrated outcomes. The current manuscript is a system design and architecture paper without empirical evaluations. We will revise the abstract to present the streamlining and burden reduction as the intended benefits of the proposed design and optimizations, rather than as measured results. We will also add a dedicated discussion section outlining qualitative arguments for these benefits based on the architecture.
  Revision: yes
- Referee: [Abstract] Abstract: the description of 'model-driven scalar and aggregate functions' and 'novel SQL DDL abstractions: PROMPT and MODEL' is given at a high level only, with no specification of their semantics, how they interact with the query optimizer, or how cost-based decisions for batching/caching are realized; this absence prevents assessment of whether the proposed deep integration is technically feasible or load-bearing for the central claim.
  Authors: The body of the manuscript provides usage examples and high-level descriptions of the functions and DDL objects, along with mentions of optimizer extensions for batching and caching. However, we acknowledge that the abstract is overly terse and that more precise semantics and interaction details would aid assessment. We will revise the abstract to include brief semantic highlights and ensure the relevant sections (on function definitions, DDL modeling, and cost-based planning) are expanded with additional specification of semantics and optimizer integration in the revised version.
  Revision: yes
Circularity Check
No significant circularity
The paper proposes a system design (FlockMTL) for embedding LLM/RAG capabilities into DuckDB via scalar/aggregate functions and new PROMPT/MODEL DDL objects. No mathematical derivations, equations, fitted parameters, or predictions appear in the provided material. Claims about reduced orchestration effort are design assertions rather than results derived from inputs by construction. No self-citation chains, uniqueness theorems, or ansatzes are load-bearing. This matches the default expectation for non-circular system papers.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: LLM inference can be exposed as scalar and aggregate functions that support tuple-level mappings and reductions without breaking relational semantics.
invented entities (1)
- PROMPT and MODEL DDL objects: no independent evidence
Forward citations
Cited by 2 Pith papers
- PLOP: Cost-Based Placement of Semantic Operators in Hybrid Query Plans
  PLOP is a cost-based optimizer that finds optimal placements for semantic LLM operators in hybrid query plans via dynamic programming, delivering up to 1.5x speedup and 4.29x cost reduction on 44 benchmark queries whi...
- LLM+Graph@VLDB'2025 Workshop Summary
  The report summarizes key research directions, challenges, and solutions from the LLM+Graph workshop at VLDB 2025.