Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Knowledge-intensive analytical applications retrieve context from both structured tabular data and unstructured, free-text documents for effective decision-making. Large language models (LLMs) have made it significantly easier to prototype such retrieval and reasoning data pipelines. However, implementing these pipelines efficiently still demands significant effort and has several challenges. This often involves orchestrating heterogeneous data systems, managing data movement, and handling low-level implementation details, e.g., LLM context management. To address these challenges, we introduce FlockMTL: an extension for DBMSs that deeply integrates LLM capabilities and retrieval-augmented generation (RAG). FlockMTL includes model-driven scalar and aggregate functions, enabling chained predictions through tuple-level mappings and reductions. Drawing inspiration from the relational model, FlockMTL incorporates: (i) cost-based optimizations, which seamlessly apply techniques such as batching and caching; and (ii) resource independence, enabled through novel SQL DDL abstractions: PROMPT and MODEL, introduced as first-class schema objects alongside TABLE. FlockMTL streamlines the development of knowledge-intensive analytical applications, and its optimizations ease the implementation burden.
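
The tuple-level mapping, reduction, batching, and caching ideas in the abstract can be illustrated with a minimal Python sketch. It is not FlockMTL's API or SQL syntax: `fake_model` and `BatchedCachedModel` are hypothetical names, and the "model" is a stub that upper-cases its input so the example runs without any LLM service.

```python
from typing import Callable, Dict, List


def fake_model(prompts: List[str]) -> List[str]:
    """Hypothetical stand-in for an LLM: one answer per prompt in the batch."""
    return [p.upper() for p in prompts]


class BatchedCachedModel:
    """Illustrative wrapper (an assumption, not FlockMTL code) that batches and caches calls."""

    def __init__(self, model: Callable[[List[str]], List[str]], batch_size: int = 2):
        self.model = model
        self.batch_size = batch_size
        self.cache: Dict[str, str] = {}
        self.calls = 0  # number of batched model invocations

    def map(self, prompts: List[str]) -> List[str]:
        """Scalar-style use: one prediction per input tuple."""
        # Deduplicate while keeping order, and skip prompts already cached.
        pending = [p for p in dict.fromkeys(prompts) if p not in self.cache]
        for i in range(0, len(pending), self.batch_size):
            batch = pending[i:i + self.batch_size]
            self.calls += 1
            for prompt, answer in zip(batch, self.model(batch)):
                self.cache[prompt] = answer
        return [self.cache[p] for p in prompts]

    def reduce(self, prompts: List[str], combine: Callable[[List[str]], str]) -> str:
        """Aggregate-style use: fold many per-tuple predictions into one value."""
        return combine(self.map(prompts))


rows = ["duck", "goose", "duck", "swan"]
llm = BatchedCachedModel(fake_model, batch_size=2)
mapped = llm.map(rows)                    # three distinct prompts, two batches
summary = llm.reduce(rows, " | ".join)    # served entirely from the cache
print(mapped, summary, llm.calls)
```

In this sketch the duplicate `"duck"` row costs no extra model call, and the reduction reuses the cached results of the mapping, which is the kind of saving the abstract attributes to batching and caching.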

citation-role summary

background 1

citation-polarity summary

fields

cs.DB 2

years

2026 2

polarities

background 1

representative citing papers

PLOP: Cost-Based Placement of Semantic Operators in Hybrid Query Plans

cs.DB · 2026-04-10 · conditional · novelty 7.0

PLOP is a cost-based optimizer that finds optimal placements for semantic LLM operators in hybrid query plans via dynamic programming, delivering up to 1.5x speedup and 4.29x cost reduction on 44 benchmark queries while preserving accuracy.

LLM+Graph@VLDB'2025 Workshop Summary

cs.DB · 2026-04-03 · unverdicted · novelty 1.0

The report summarizes key research directions, challenges, and solutions from the LLM+Graph workshop at VLDB 2025.

citing papers explorer

  • PLOP: Cost-Based Placement of Semantic Operators in Hybrid Query Plans cs.DB · 2026-04-10 · conditional · none · ref 6 · internal anchor

  • LLM+Graph@VLDB'2025 Workshop Summary cs.DB · 2026-04-03 · unverdicted · none · ref 10 · internal anchor