Blue Data Intelligence Layer: Streaming Data and Agents for Multi-source Multi-modal Data-Centric Applications
Pith reviewed 2026-05-10 10:47 UTC · model grok-4.3
The pith
DIL unifies enterprise data, LLM knowledge, and user context to answer natural language queries that span multiple sources and modalities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
At the core of DIL is a data registry that stores metadata for diverse data sources and modalities. DIL treats LLMs, the Web, and the User as source 'databases' each with their own query interface. Data planners transform user queries into executable query plans that unify relational operators with other operators spanning multiple modalities, supporting decomposition, retrieval, reasoning, and integration.
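To make the registry idea concrete, here is a minimal sketch, not the paper's implementation. It assumes a registry entry holds a name, a modality tag, a description, and a callable query interface; `SourceEntry`, `DataRegistry`, and the three stub backends are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class SourceEntry:
    """Registry metadata for one source. All field names are hypothetical."""
    name: str
    modality: str                      # e.g. "relational", "text", "interactive"
    description: str
    query: Callable[[str], List[Row]]  # the source's own query interface


class DataRegistry:
    """Maps source names to metadata plus a callable query interface."""

    def __init__(self) -> None:
        self._sources: Dict[str, SourceEntry] = {}

    def register(self, entry: SourceEntry) -> None:
        self._sources[entry.name] = entry

    def by_modality(self, modality: str) -> List[str]:
        """Return the names of all sources with the given modality, sorted."""
        return sorted(n for n, e in self._sources.items() if e.modality == modality)

    def query(self, name: str, request: str) -> List[Row]:
        """Forward a request to the named source and return its rows."""
        return self._sources[name].query(request)


# Stub backends standing in for a real database, an LLM call, and a user prompt.
def stub_sql(request: str) -> List[Row]:
    return [{"employee": "ana", "city": "Lisbon"}]


def stub_llm(request: str) -> List[Row]:
    return [{"answer": f"(stubbed LLM reply to: {request})"}]


def stub_user(request: str) -> List[Row]:
    return [{"answer": "vegetarian"}]


registry = DataRegistry()
registry.register(SourceEntry("hr_db", "relational", "Employee records", stub_sql))
registry.register(SourceEntry("llm", "text", "World knowledge", stub_llm))
registry.register(SourceEntry("user", "interactive", "Personal context", stub_user))

for name in ("hr_db", "llm", "user"):
    print(name, registry.query(name, "Which city is Ana based in?"))
```

The point of the sketch is only that a database, an LLM, and the user sit behind the same lookup-and-query surface; how DIL actually stores metadata or dispatches queries is not specified in the source.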
What carries the argument
Data planners that convert natural language queries into declarative plans for multi-source, multi-modal execution.
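One way to picture a declarative plan that mixes relational and non-relational operators is sketched below. The plan is written by hand, since the paper does not specify a planning algorithm; the `Step` schema, the operator names, and the lookup table standing in for an LLM are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Rows = List[Dict[str, Any]]


@dataclass
class Step:
    """One node of a declarative plan (hypothetical schema)."""
    id: str
    op: str                                   # operator name
    inputs: List[str] = field(default_factory=list)
    args: Dict[str, Any] = field(default_factory=dict)


def op_scan(inputs: List[Rows], args: Dict[str, Any]) -> Rows:
    # Stands in for retrieval from a relational source.
    return list(args["rows"])


def op_llm_map(inputs: List[Rows], args: Dict[str, Any]) -> Rows:
    # Stands in for an LLM lookup; a fixed table replaces the model call.
    table = args["knowledge"]
    return [{**row, args["out"]: table.get(row[args["key"]])} for row in inputs[0]]


def op_filter(inputs: List[Rows], args: Dict[str, Any]) -> Rows:
    # Ordinary relational selection on one column.
    return [row for row in inputs[0] if row[args["column"]] == args["equals"]]


OPERATORS: Dict[str, Callable[[List[Rows], Dict[str, Any]], Rows]] = {
    "scan": op_scan,
    "llm_map": op_llm_map,
    "filter": op_filter,
}


def execute(plan: List[Step]) -> Rows:
    """Run steps in the listed order; assumes inputs appear before their consumers."""
    results: Dict[str, Rows] = {}
    for step in plan:
        inputs = [results[i] for i in step.inputs]
        results[step.id] = OPERATORS[step.op](inputs, step.args)
    return results[plan[-1].id]


# "Which of our offices are in Japan?" decomposed into retrieve, enrich, filter.
plan = [
    Step("offices", "scan", args={"rows": [{"city": "Lisbon"}, {"city": "Osaka"}]}),
    Step("with_country", "llm_map", ["offices"],
         {"key": "city", "out": "country",
          "knowledge": {"Lisbon": "Portugal", "Osaka": "Japan"}}),
    Step("in_japan", "filter", ["with_country"],
         {"column": "country", "equals": "Japan"}),
]

result = execute(plan)
print(result)
```

The plan is plain data, so it can be inspected or rewritten before execution; that is the sense of "declarative" this sketch aims for, and it makes no claim about how DIL's planners generate or optimize such plans.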
Load-bearing premise
Data planners can reliably decompose complex requests, retrieve from heterogeneous sources, and integrate results across modalities without substantial errors or additional human intervention.
What would settle it
A test case involving an iterative query that requires combining enterprise database results, LLM commonsense knowledge, and user-specific context where the system fails to produce accurate integrated output.
Original abstract
NL2SQL systems aim to address the growing need for natural language interaction with data. However, real-world information needs rarely map to a single SQL query because (1) users express queries iteratively, (2) questions often span multiple data sources beyond the closed-world assumption of a single database, and (3) queries frequently rely on commonsense or external knowledge. Consequently, satisfying realistic data needs requires integrating heterogeneous sources, modalities, and contextual data. In this paper, we present Blue's Data Intelligence Layer (DIL), designed to support multi-source, multi-modal, and data-centric applications. Blue is a compound AI system that orchestrates agents and data for enterprise settings. DIL serves as the data intelligence layer for agentic data processing, bridging the semantic gap between user intent and available information by unifying structured enterprise data, world knowledge accessible through LLMs, and personal context obtained through interaction. At the core of DIL is a data registry that stores metadata for diverse data sources and modalities to enable both native and natural language queries. DIL treats LLMs, the Web, and the User as source 'databases', each with their own query interface, elevating them to first-class data sources. DIL relies on data planners to transform user queries into executable query plans. These plans are declarative abstractions that unify relational operators with other operators spanning multiple modalities. DIL planners support decomposition of complex requests into subqueries, retrieval from diverse sources, and finally reasoning and integration to produce final results. We demonstrate DIL through two interactive scenarios in which user queries dynamically trigger multi-source retrieval, cross-modal reasoning, and result synthesis, illustrating how compound AI systems can move beyond single-database NL2SQL.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Blue's Data Intelligence Layer (DIL) as a component of a compound AI system for enterprise data-centric applications. It argues that NL2SQL systems fall short for iterative, multi-source queries requiring external knowledge, and proposes DIL to bridge the semantic gap by unifying structured enterprise data with LLM-accessible world knowledge and user context. Key elements include a data registry storing metadata for heterogeneous sources and modalities, treating LLMs, the Web, and users as first-class queryable sources, and data planners that convert natural language queries into declarative plans unifying relational and multi-modal operators. These planners enable decomposition into subqueries, retrieval, reasoning, and result integration. The approach is illustrated with two interactive scenarios demonstrating dynamic multi-source retrieval and cross-modal synthesis.
Significance. If the data planners prove reliable, DIL could advance compound AI systems by providing an architectural framework for agentic data processing that integrates diverse sources beyond closed-world databases. The elevation of LLMs, web, and user interactions to queryable entities offers a conceptual contribution to multi-modal data access, potentially enabling more flexible natural language interactions in enterprise settings. The manuscript supplies a coherent high-level vision and illustrative examples, which clarify the intended unification of operators, though empirical validation would be needed to realize this significance.
major comments (2)
- [Abstract and data planners section] The central claim that DIL bridges the semantic gap by unifying sources and enabling reliable decomposition, retrieval, and integration rests on the data planners' capabilities. However, the manuscript gives only a high-level overview of transforming queries into declarative plans without specifying the planning algorithm, how relational and multi-modal operators are unified, or mechanisms for error handling in cross-source reasoning. This directly impacts the weakest assumption (planner reliability across heterogeneous sources) and leaves the claim unevaluable.
- [Interactive scenarios section] The two scenarios are offered as demonstrations of multi-source retrieval, cross-modal reasoning, and result synthesis. Yet they contain no quantitative metrics (e.g., success rates, accuracy, latency), error analysis, ablation studies, or baselines, making it impossible to assess whether the planners support realistic queries without substantial human intervention. This is load-bearing for the paper's positioning of DIL as a practical solution beyond NL2SQL.
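As an illustration of the kind of measurement the second comment asks for, the sketch below computes a success rate and a median latency over a set of query/expected-answer pairs. It is a generic harness, not anything from the paper; `evaluate`, `toy_system`, and the test cases are hypothetical, and exact-match scoring is a simplifying assumption.

```python
import time
from statistics import median
from typing import Any, Callable, Dict, List, Tuple


def evaluate(system: Callable[[str], Any],
             cases: List[Tuple[str, Any]]) -> Dict[str, float]:
    """Run each query once; score by exact match and record wall-clock latency."""
    correct = 0
    latencies: List[float] = []
    for query, expected in cases:
        start = time.perf_counter()
        try:
            answer = system(query)
        except Exception:
            answer = None  # a crash counts as a failed query
        latencies.append(time.perf_counter() - start)
        correct += int(answer == expected)
    return {
        "success_rate": correct / len(cases),
        "median_latency_s": median(latencies),
    }


# Toy stand-in for a planner-backed system (hypothetical).
def toy_system(query: str) -> str:
    if query == "boom":
        raise ValueError("simulated planner failure")
    return query.upper()


cases = [("a", "A"), ("b", "x"), ("boom", "BOOM"), ("c", "C")]
report = evaluate(toy_system, cases)
print(report)
```

A real study would also need baselines and a scoring rule suited to free-form or multi-modal answers; the harness only shows how little machinery the basic numbers require.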
minor comments (2)
- The title references streaming data and agents, but the abstract and core description emphasize query planning and scenarios without detailing streaming mechanisms or agent orchestration; clarify this aspect for consistency.
- The manuscript would benefit from additional references to prior work on multi-modal agents, data integration frameworks, and NL2SQL extensions to better contextualize the novelty of treating LLMs/Web/User as first-class sources.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which helps clarify the intended scope of our work as a high-level architectural vision for the Data Intelligence Layer (DIL) within compound AI systems. We address each major comment below, focusing on the conceptual contributions while acknowledging areas where additional clarification can strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract and data planners section] The central claim that DIL bridges the semantic gap by unifying sources and enabling reliable decomposition, retrieval, and integration rests on the data planners' capabilities. However, the manuscript gives only a high-level overview of transforming queries into declarative plans without specifying the planning algorithm, how relational and multi-modal operators are unified, or mechanisms for error handling in cross-source reasoning. This directly impacts the weakest assumption (planner reliability across heterogeneous sources) and leaves the claim unevaluable.
Authors: We agree that the manuscript presents the data planners at a conceptual level without a specific algorithm, detailed unification mechanics, or error-handling protocols. This reflects the paper's focus on an architectural framework that elevates LLMs, the Web, and users as first-class sources via the data registry, with declarative plans serving as abstractions to unify relational and multi-modal operators. The planners enable decomposition, retrieval, and integration as described, but concrete planning algorithms and reliability mechanisms remain implementation details for future work. We will revise the data planners section to include expanded examples of declarative plan structures, operator unification through the registry metadata, and a discussion of high-level error-handling strategies (e.g., via agentic reasoning loops), making the conceptual claims more concrete without claiming empirical reliability.
Revision: partial
- Referee: [Interactive scenarios section] The two scenarios are offered as demonstrations of multi-source retrieval, cross-modal reasoning, and result synthesis. Yet they contain no quantitative metrics (e.g., success rates, accuracy, latency), error analysis, ablation studies, or baselines, making it impossible to assess whether the planners support realistic queries without substantial human intervention. This is load-bearing for the paper's positioning of DIL as a practical solution beyond NL2SQL.
Authors: The scenarios function as illustrative demonstrations of DIL's dynamic capabilities in multi-source, multi-modal settings, consistent with the paper's positioning as a vision for agentic data processing rather than a performance evaluation. No quantitative metrics, error analyses, or baselines are included because the work does not claim or evaluate a deployed planner implementation. We acknowledge this limits direct assessment of practicality and human intervention needs. In revision, we will add an explicit limitations subsection and a forward-looking discussion of planned empirical studies (including metrics and baselines) to better contextualize the examples and address the concern about positioning DIL as a practical advance over NL2SQL.
Revision: partial
Circularity Check
No circularity: purely descriptive system architecture with no derivations or self-referential claims
Full rationale
The paper provides an architectural description of the Data Intelligence Layer (DIL), its data registry, treatment of LLMs/Web/User as first-class sources, and data planners for query decomposition and multi-modal integration. It contains no equations, fitted parameters, predictions, uniqueness theorems, or derivation chains of any kind. Claims about bridging semantic gaps and supporting agentic processing are presented as design goals illustrated by two scenarios, without any reduction to inputs by construction, self-citation load-bearing premises, or renamed empirical patterns. The system is self-contained as a high-level overview and does not invoke external results that collapse back into its own definitions.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLMs can serve as reliable sources of world knowledge and commonsense for query answering
- domain assumption: User interaction history provides usable personal context for query personalization
invented entities (3)
- Data Intelligence Layer (DIL): no independent evidence
- Data registry: no independent evidence
- Data planners: no independent evidence