Declarative Data Services: Structured Agentic Discovery for Composing Data Systems
Pith reviewed 2026-05-21 05:14 UTC · model grok-4.3
The pith
Structured agentic discovery using four typed contracts lets data-system compositions converge where unbounded search fails.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Declarative Data Services owns four typed contracts at successive layers (intent, operator DAG, per-system skills, runtime attribution) that decompose the global search into bounded sub-searches performed by sub-agents. The framework supplies channels for knowledge to flow forward as inline skill citations and for errors to route backward as typed signals. In a proof-of-life demonstration on a trading-backend workload, this architecture converges to working stacks where unbounded agentic discovery does not, and runtime failures become skill patches cited inline in the next deployment.
What carries the argument
The four typed contracts (intent, operator DAG, per-system skills, runtime attribution) that break the global composition search into bounded sub-searches and route knowledge via inline citations and typed error signals.
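One way to picture the four layers is as plain typed records with a check between them. The sketch below is illustrative only and is not the paper's implementation: every class name, field, and the `check_dag` helper are assumptions introduced here, since the page does not give the contracts' actual schemas.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter


@dataclass(frozen=True)
class Intent:
    """Layer 1: declarative user intent (hypothetical fields)."""
    workload: str
    requirements: tuple[str, ...] = ()


@dataclass(frozen=True)
class Operator:
    """Layer 2: one node of the operator DAG, bound to a concrete system."""
    name: str
    system: str
    inputs: tuple[str, ...] = ()


@dataclass(frozen=True)
class Skill:
    """Layer 3: per-system composition knowledge, addressable by id."""
    skill_id: str
    system: str
    guidance: str


@dataclass(frozen=True)
class Attribution:
    """Layer 4: a runtime failure attributed to a system and a layer."""
    system: str
    layer: str
    message: str


def check_dag(operators: list[Operator], skills: list[Skill]) -> list[str]:
    """Return contract violations; an empty list means the DAG is well-formed."""
    problems: list[str] = []
    names = {op.name for op in operators}
    covered = {skill.system for skill in skills}
    for op in operators:
        for dep in op.inputs:
            if dep not in names:
                problems.append(f"{op.name}: unknown input {dep}")
        if op.system not in covered:
            problems.append(f"{op.name}: no skill for system {op.system}")
    try:
        # static_order() raises CycleError if the operator graph is not a DAG.
        graph = {op.name: set(op.inputs) for op in operators}
        tuple(TopologicalSorter(graph).static_order())
    except CycleError:
        problems.append("operator graph contains a cycle")
    return problems
```

The point of the sketch is that each layer can be validated locally: a sub-agent proposing an operator DAG only needs the operator and skill records, not the whole composition problem.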
If this is right
- Runtime failures are captured as reusable skill patches that later deployments cite directly.
- Sub-agents can succeed within their narrower, typed search spaces even when the overall composition problem remains large.
- Composition knowledge accumulates across deployments through the framework's citation and signal channels rather than depending solely on pretraining.
- Declarative user intent can drive end-to-end composition of heterogeneous data systems without requiring the agent to maintain the entire search space in one context.
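The first and third implications describe a loop: a runtime failure becomes a skill entry, and the next deployment cites that entry inline. A minimal sketch of that loop follows. The `SkillBook` class, the `system#n` citation format, and the `# cites` comment style are all hypothetical choices made for illustration, not the format used by the paper.

```python
from dataclasses import dataclass, field


@dataclass
class SkillBook:
    """Per-system skill entries keyed by a citable id (hypothetical layout)."""
    system: str
    entries: dict[str, str] = field(default_factory=dict)

    def patch(self, failure: str, fix: str) -> str:
        """Record a fix learned from a runtime failure and return its citation id."""
        entry_id = f"{self.system}#{len(self.entries) + 1}"
        self.entries[entry_id] = f"{fix} (learned from: {failure})"
        return entry_id


def render_config(book: SkillBook, settings: dict[str, str],
                  cites: dict[str, str]) -> str:
    """Emit config lines, citing the skill entry behind each setting inline."""
    lines = []
    for key, value in settings.items():
        cite = cites.get(key)
        if cite is not None and cite not in book.entries:
            # A citation that does not resolve would break traceability.
            raise KeyError(f"citation {cite} does not resolve to a skill entry")
        suffix = f"  # cites {cite}" if cite else ""
        lines.append(f"{key}: {value}{suffix}")
    return "\n".join(lines)
```

Rejecting unresolved citations is a design choice of this sketch: it keeps every cited setting traceable to a recorded failure, which is the property the page attributes to the framework.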
Where Pith is reading between the lines
- The same layered-contract pattern could be applied to composing infrastructure stacks or scientific workflows where components also interact through typed interfaces.
- Runtime attribution might reduce reliance on exhaustive pretraining by letting the system learn system-specific behaviors from actual deployments.
- Extending the contracts to include cost or latency objectives could turn the current convergence result into an optimization method.
- The approach suggests that many agentic discovery tasks become tractable once the search is factored into contract-defined layers rather than left fully open-ended.
Load-bearing premise
That the four contracts can be defined and maintained so sub-agents reliably perform their bounded searches and that inline citations plus typed errors suffice to carry useful knowledge across iterations.
What would settle it
Run repeated, independent trials of unbounded discovery and of DDS on the identical trading-backend workload, and record how often each produces a working stack. The claim is supported if DDS converges in every trial while unbounded discovery continues to fail.
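The comparison described here reduces to counting working stacks per method over the same number of trials. The harness below is a hedged sketch: `convergence_rate` and `compare` are hypothetical names, and each `discover` callable is assumed to wrap one full discovery run and return `True` only if the deployed stack actually ran.

```python
from typing import Callable

# A discovery procedure takes a trial seed and reports whether the stack ran.
Discover = Callable[[int], bool]


def convergence_rate(discover: Discover, trials: int) -> float:
    """Fraction of trials in which `discover` produced a working stack."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    successes = sum(1 for seed in range(trials) if discover(seed))
    return successes / trials


def compare(structured: Discover, unbounded: Discover, trials: int) -> dict[str, float]:
    """Run both procedures for the same number of trials on the same workload."""
    return {
        "structured": convergence_rate(structured, trials),
        "unbounded": convergence_rate(unbounded, trials),
    }
```

Reporting a rate per method, rather than a single run, is what would separate an effect of the contract structure from run-to-run variance.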
Original abstract
Agentic discovery has shown that LLM-driven search can find novel algorithms, designs, and code under benchmark conditions. Translating the paradigm to multi-system data backends surfaces a harder problem: the search space is heterogeneous, the verifier is whether a deployed stack actually runs, and composition knowledge is unevenly captured in pretraining. Unbounded agentic discovery, a coding agent iterating on failure-log feedback, fails to converge consistently on a working stack even when iteration and explicit composition knowledge are added. We propose Declarative Data Services (DDS), an architecture for structured agentic discovery of data-system compositions from declarative user intent. The framework owns four typed contracts at successive layers (intent, operator DAG, per-system skills, runtime attribution) that decompose the global search into bounded sub-searches; sub-agents search each typed space, while the framework provides the channels by which knowledge flows forward as inline skill citations and errors route backward as typed signals. As a proof of life on a trading-backend workload, DDS converges where unbounded discovery does not; runtime failures become skill patches that the next deployment cites inline. We position this as an early prototype reporting lessons from real-world data-system composition.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Declarative Data Services (DDS), an architecture for structured agentic discovery of data-system compositions from declarative user intent using four typed contracts (intent, operator DAG, per-system skills, runtime attribution). These contracts decompose the heterogeneous search space into bounded sub-searches performed by sub-agents, with forward knowledge flow via inline skill citations and backward propagation of typed error signals. As a proof-of-life demonstration on a trading-backend workload, DDS is reported to converge on a working stack where unbounded agentic discovery (even with iteration and composition knowledge) fails, converting runtime failures into reusable skill patches cited in subsequent deployments.
Significance. If the central claim holds, the work offers a structured alternative to unbounded LLM-driven search for practical multi-system data backend composition, where verifiers are deployment success and pretraining knowledge is uneven. The explicit layering of contracts and bidirectional knowledge channels could generalize to other heterogeneous composition tasks; the positioning as an early prototype reporting real-world lessons is a constructive contribution even at this stage.
major comments (2)
- [Abstract and Evaluation] Abstract and Evaluation (proof-of-life section): The claim that DDS converges where unbounded discovery does not rests on a single unreported workload run without iteration counts, success rates, failure-mode distributions, baseline comparisons, or logs showing how the four contracts produced bounded sub-searches. This is load-bearing for the central architectural claim and leaves open whether convergence arises from the contract structure, workload simplicity, or unstated human tuning.
- [Architecture] Architecture (contract definitions): The assumption that the four typed contracts can be maintained so that sub-agents reliably bound their searches and that inline citations plus typed runtime attribution signals propagate root-cause knowledge (rather than generic errors) is asserted but not supported by ablation or tracing of signal flow across iterations in the reported case.
minor comments (2)
- [Introduction/Architecture] Add a dedicated subsection early in the paper that formally defines the interfaces and invariants of each of the four typed contracts to improve readability for readers unfamiliar with the layered approach.
- [Related Work] Expand the related-work discussion to include recent agentic discovery systems and data-system composition frameworks for clearer positioning.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments correctly identify areas where the proof-of-life demonstration would benefit from greater transparency. We address each major comment below and outline the revisions we will make.
- Referee: [Abstract and Evaluation] Abstract and Evaluation (proof-of-life section): The claim that DDS converges where unbounded discovery does not rests on a single unreported workload run without iteration counts, success rates, failure-mode distributions, baseline comparisons, or logs showing how the four contracts produced bounded sub-searches. This is load-bearing for the central architectural claim and leaves open whether convergence arises from the contract structure, workload simplicity, or unstated human tuning.
  Authors: We acknowledge that the current evaluation presents only a qualitative proof-of-life on one trading-backend workload and does not report quantitative metrics such as iteration counts, success rates, or detailed logs. The manuscript positions this as an early prototype illustrating that the contract structure enabled convergence where an unbounded baseline did not. In the revised manuscript we will expand the evaluation section to include available iteration counts, observed failure modes in the unbounded case, and a step-by-step description of how the four contracts produced bounded sub-searches in the reported run. We will also add a brief discussion of potential human tuning and workload characteristics to address the concern that convergence may not generalize from the contract design alone.
  Revision: yes
- Referee: [Architecture] Architecture (contract definitions): The assumption that the four typed contracts can be maintained so that sub-agents reliably bound their searches and that inline citations plus typed runtime attribution signals propagate root-cause knowledge (rather than generic errors) is asserted but not supported by ablation or tracing of signal flow across iterations in the reported case.
  Authors: We agree that the manuscript asserts the utility of the four contracts and the bidirectional knowledge channels without providing explicit tracing or ablation evidence. The proof-of-life example shows the outcome but does not walk through signal propagation. In the revision we will add a new figure and accompanying text that traces the forward flow of inline skill citations and the backward propagation of typed error signals for the reported deployment. We will also include a short discussion of observed challenges in maintaining contract consistency and how root-cause attribution differed from generic error logs in the case study.
  Revision: yes
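The distinction the referee draws between root-cause signals and generic errors can be made concrete with a small routing sketch. Everything below is an assumption for illustration: the `Layer` names mirror the contract layers, but the error kinds, the `ROUTES` table, and the `route` function are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    INTENT = "intent"
    DAG = "operator_dag"
    SKILL = "per_system_skill"


@dataclass(frozen=True)
class ErrorSignal:
    kind: str    # hypothetical error type, e.g. "bad_config"
    system: str  # system the runtime attributed the failure to
    detail: str  # raw message kept for the trace


# Hypothetical routing table: which layer owns which kind of error.
ROUTES = {
    "unsatisfiable_requirement": Layer.INTENT,
    "missing_edge": Layer.DAG,
    "bad_config": Layer.SKILL,
}


def route(signals: list[ErrorSignal]) -> tuple[dict[Layer, list[ErrorSignal]], list[ErrorSignal]]:
    """Send each typed signal back to its owning layer.

    Signals whose kind is not in ROUTES are returned separately: they are
    the generic errors that carry no root-cause attribution.
    """
    routed: dict[Layer, list[ErrorSignal]] = {layer: [] for layer in Layer}
    unrouted: list[ErrorSignal] = []
    for signal in signals:
        layer = ROUTES.get(signal.kind)
        if layer is None:
            unrouted.append(signal)
        else:
            routed[layer].append(signal)
    return routed, unrouted
```

In a trace of this kind, the share of signals that end up unrouted is one simple measure of how often attribution fell back to generic error logs.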
Circularity Check
No circularity: architectural proposal without derivation chain or self-referential reduction
The manuscript proposes Declarative Data Services as an architectural framework that decomposes agentic search into four typed contracts (intent, operator DAG, per-system skills, runtime attribution) to bound sub-searches and route knowledge via inline citations and typed error signals. This is presented as an original design with a proof-of-life demonstration on a trading-backend workload rather than any numerical derivation, fitted-parameter prediction, or equation that reduces to its own inputs by construction. No self-citation load-bearing steps, uniqueness theorems imported from prior author work, or ansatz smuggling appear in the description; the central claim of convergence where unbounded discovery fails is asserted via empirical illustration without reducing to a tautology or renamed known result. The proposal remains self-contained as an engineering architecture.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: LLM agents can perform effective bounded sub-searches when supplied with typed contracts and bidirectional knowledge channels.
invented entities (1)
- Four typed contracts (intent, operator DAG, per-system skills, runtime attribution): no independent evidence.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "The framework owns four typed contracts at successive layers (intent, operator DAG, per-system skills, runtime attribution) that decompose the global search into bounded sub-searches"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
A Example Agent Skill: ClickHouse
Figure A shows a trimmed excerpt of the clickhouse.yaml skill with one representative entry per block. The dated comments are real attribution-log entries: each was added after a specific failure during the learning-loop experiment (§4.3), which is the traceability property cited in §3.
B Per-run detail f...