pith. machine review for the scientific record.

arxiv: 2604.07041 · v1 · submitted 2026-04-08 · 💻 cs.DB · cs.AI · cs.ET · cs.HC · cs.IR

Recognition: unknown

AV-SQL: Decomposing Complex Text-to-SQL Queries with Agentic Views

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 17:05 UTC · model grok-4.3

classification 💻 cs.DB · cs.AI · cs.ET · cs.HC · cs.IR
keywords text-to-SQL · agentic views · common table expressions · query decomposition · large language models · database schemas · SQL generation

The pith

Agent-generated common table expressions decompose complex natural language database queries for large language models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a framework that splits the conversion of natural language questions into SQL statements into stages handled by separate agents. Central to it are agentic views, which are generated intermediate table definitions that hold partial logic and select only the needed parts of a large database structure. This setup tackles cases where the entire database description is too big to fit in one prompt and where single-pass generation produces broken queries. Experiments on several benchmarks indicate the pipeline produces more executable and accurate results than prior methods, especially on tests with intricate multi-table reasoning. If the approach works as described, non-experts could query complex real-world databases more reliably without writing code themselves.

Core claim

AV-SQL decomposes complex Text-to-SQL into a pipeline of specialized LLM agents. Central to AV-SQL is the concept of agentic views: agent-generated Common Table Expressions (CTEs) that encapsulate intermediate query logic and filter relevant schema elements from large schemas. AV-SQL operates in three stages: (1) a rewriter agent compresses and clarifies the input query; (2) a view generator agent processes schema chunks to produce agentic views; and (3) a planner, generator, and revisor agent collaboratively compose these views into the final SQL query.
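
As a rough illustration of the control flow these three stages imply, here is a minimal sketch. Every function name and prompt below is hypothetical, standing in for the authors' released implementation (https://github.com/pminhtam/AV-SQL) rather than reproducing it.

```python
# A minimal sketch of the three-stage pipeline described above. All names
# and prompts are hypothetical, not taken from the authors' code.
from typing import Callable

def av_sql(question: str, schema_chunks: list[str], llm: Callable[[str], str]) -> str:
    # Stage 1: the rewriter agent compresses and clarifies the question.
    rewritten = llm(f"Rewrite this question clearly and explicitly:\n{question}")

    # Stage 2: the view generator turns each schema chunk into an
    # "agentic view" -- a CTE holding partial logic for that chunk.
    views = [
        llm(
            f"Question: {rewritten}\nSchema chunk:\n{chunk}\n"
            "Emit one CTE that selects only the columns this question needs."
        )
        for chunk in schema_chunks
    ]

    # Stage 3: planner, generator, and revisor compose the views into SQL.
    plan = llm("Plan how to combine these views:\n" + "\n".join(views))
    draft = llm(f"Write the final SQL following this plan:\n{plan}")
    return llm(f"Revise this SQL for executability and correctness:\n{draft}")
```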

What carries the argument

Agentic views: agent-generated Common Table Expressions (CTEs) that encapsulate intermediate query logic and filter relevant schema elements from large schemas.
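
To make the construct concrete, here is a hypothetical example of how two agentic views, each an ordinary CTE over invented tables, could compose into one executable query. The paper's actual prompting and composition logic may differ.

```python
# Hypothetical illustration of composing agent-generated CTEs ("agentic
# views") into one final query. Table and column names are invented.

agentic_views = {
    "active_customers": (
        "SELECT customer_id, region FROM customers WHERE status = 'active'"
    ),
    "order_totals": (
        "SELECT customer_id, SUM(amount) AS total_spent "
        "FROM orders GROUP BY customer_id"
    ),
}

# The downstream planner/generator only sees the named views, not the full
# schema: each view has already filtered its chunk to the relevant columns.
with_clause = ",\n     ".join(
    f"{name} AS ({body})" for name, body in agentic_views.items()
)
final_sql = (
    f"WITH {with_clause}\n"
    "SELECT ac.region, AVG(ot.total_spent) AS avg_spent\n"
    "FROM active_customers ac\n"
    "JOIN order_totals ot ON ac.customer_id = ot.customer_id\n"
    "GROUP BY ac.region"
)
print(final_sql)
```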

Load-bearing premise

That the agent-generated CTE views reliably encapsulate intermediate logic and correctly filter relevant schema elements from large schemas without introducing semantic errors or losing critical information needed for the final query.

What would settle it

A test set of queries requiring a join or filter across tables that the view generator might skip; if the final SQL runs without error but returns incorrect rows because a required table or condition was omitted from the views, the central claim is falsified.
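
One way to operationalize this test, sketched against SQLite for concreteness (the benchmarks' own harnesses differ in dialect and row-matching details): flag queries that execute cleanly yet return the wrong result set.

```python
import sqlite3

def silent_failure(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """True when the predicted query runs without error yet returns a
    different result set than the gold query -- the failure mode that
    would falsify the central claim."""
    try:
        # Set comparison ignores row order and duplicates, as in common
        # execution-accuracy scripts; stricter matching is also possible.
        predicted = set(map(tuple, conn.execute(predicted_sql).fetchall()))
    except sqlite3.Error:
        return False  # hard failure: visible, so not the silent case
    gold = set(map(tuple, conn.execute(gold_sql).fetchall()))
    return predicted != gold
```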

Figures

Figures reproduced from arXiv: 2604.07041 by Hongzhi Yin, Minh Tam Pham, Quoc Viet Hung Nguyen, Thanh Tam Nguyen, Tong Chen, Trinh Pham.

Figure 1. A motivating example from BIRD-dev (sample 501).
Figure 2. Overview of AV-SQL. The framework begins with preprocessing (schema splitting and vector database initialization for value preprocessing), followed by three main stages: (1) Question Rewriting, which reformulates the input question, optionally using external knowledge, into a clearer and more explicit form; (2) Agent View Generation, which decomposes the long-context schema into smaller chunks and produces exe…
Figure 3. Token-length distribution of large database schemas.
Figure 4. Recall Execution Accuracy on the Spider2.0-Snow.
Figure 5. Token usage and runtime breakdown by pipeline.
Figure 6. Error taxonomy of AV-SQL with Gemini-3-Pro on Spider2-snow.
Original abstract

Text-to-SQL is the task of translating natural language queries into executable SQL for a given database, enabling non-expert users to access structured data without writing SQL manually. Despite rapid advances driven by large language models (LLMs), existing approaches still struggle with complex queries in real-world settings, where database schemas are large and questions require multi-step reasoning over many interrelated tables. In such cases, providing the full schema often exceeds the context window, while one-shot generation frequently produces non-executable SQL due to syntax errors and incorrect schema linking. To address these challenges, we introduce AV-SQL, a framework that decomposes complex Text-to-SQL into a pipeline of specialized LLM agents. Central to AV-SQL is the concept of agentic views: agent-generated Common Table Expressions (CTEs) that encapsulate intermediate query logic and filter relevant schema elements from large schemas. AV-SQL operates in three stages: (1) a rewriter agent compresses and clarifies the input query; (2) a view generator agent processes schema chunks to produce agentic views; and (3) a planner, generator, and revisor agent collaboratively compose these views into the final SQL query. Extensive experiments show that AV-SQL achieves 70.38% execution accuracy on the challenging Spider 2.0 benchmark, outperforming state-of-the-art baselines, while remaining competitive on standard datasets with 85.59% on Spider, 72.16% on BIRD and 63.78% on KaggleDBQA. Our source code is available at https://github.com/pminhtam/AV-SQL.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. AV-SQL is a multi-agent LLM framework for Text-to-SQL that decomposes complex queries using a rewriter agent, a view generator that produces agentic views (LLM-generated CTEs from schema chunks), and a planner/generator/revisor pipeline to compose the final SQL. The paper claims execution accuracies of 85.59% on Spider, 72.16% on BIRD, 63.78% on KaggleDBQA, and 70.38% on Spider 2.0, outperforming state-of-the-art baselines, with source code released.

Significance. If the performance gains can be robustly attributed to the agentic views mechanism, the work would advance Text-to-SQL for large-schema, multi-step queries by providing a structured decomposition that mitigates context limits and linking errors. The open-source release is a clear strength for reproducibility. However, without isolating the contribution of the views, the significance remains provisional.

major comments (3)
  1. [§4] §4 (Experiments): The reported 70.38% execution accuracy on Spider 2.0 and outperformance claims lack any ablation studies that isolate the agentic views (e.g., comparing the full pipeline to a version without view generation or with direct schema input), so the central empirical result cannot yet be attributed to the proposed mechanism rather than the multi-agent structure alone.
  2. [§3.2] §3.2 (View Generator Agent): No per-stage evaluation of agentic view quality is provided, such as a correctness metric for the generated CTEs, manual inspection for semantic errors in predicates/joins, or analysis of cross-chunk schema omissions; this is load-bearing because the framework relies on these LLM-generated intermediates being reliable without introducing incorrect logic.
  3. [§4.1] §4.1 (Experimental Setup): Details on controls for the stochastic LLM agents (temperature, number of runs, prompt sensitivity, or error analysis of failure cases) are absent, undermining confidence in the benchmark numbers and the claim that gains stem from reliable decomposition.
minor comments (2)
  1. [§3] The description of schema chunking in §3 lacks explicit pseudocode or parameters for chunk size and overlap, which would clarify how cross-chunk dependencies are handled; one possible scheme is sketched after this list.
  2. Table or figure captions for benchmark results should include exact baseline names and versions for direct comparison.
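
Since the chunking parameters are unspecified, here is one plausible scheme with illustrative defaults; max_tokens and overlap are assumptions, not values from the paper.

```python
# One possible chunking scheme the paper could specify. The parameters
# (max_tokens, overlap) are illustrative, not taken from the paper.

def chunk_schema(table_ddls: list[str], max_tokens: int = 2000,
                 overlap: int = 1) -> list[list[str]]:
    """Group whole-table DDL statements into chunks under a token budget,
    repeating the last `overlap` tables of each chunk at the start of the
    next so cross-chunk joins have a shared anchor."""
    def n_tokens(s: str) -> int:
        return len(s.split())  # crude proxy for a real tokenizer

    chunks: list[list[str]] = []
    current: list[str] = []
    size = 0
    for ddl in table_ddls:
        if current and size + n_tokens(ddl) > max_tokens:
            chunks.append(current)
            current = current[-overlap:] if overlap else []
            size = sum(n_tokens(t) for t in current)
        current.append(ddl)
        size += n_tokens(ddl)
    if current:
        chunks.append(current)
    return chunks
```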

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback on our manuscript. The comments highlight important areas for strengthening the empirical support and transparency of AV-SQL. We agree with the need for clearer isolation of the agentic views contribution, per-stage analysis, and experimental controls. We will revise the paper accordingly, as detailed in the point-by-point responses below.

Point-by-point responses
  1. Referee: [§4] §4 (Experiments): The reported 70.38% execution accuracy on Spider 2.0 and outperformance claims lack any ablation studies that isolate the agentic views (e.g., comparing the full pipeline to a version without view generation or with direct schema input), so the central empirical result cannot yet be attributed to the proposed mechanism rather than the multi-agent structure alone.

    Authors: We agree that ablation studies are necessary to isolate the contribution of agentic views from the broader multi-agent pipeline. In the revised manuscript, we will add a dedicated ablation subsection in §4. This will include results on Spider 2.0 for: (1) the full AV-SQL pipeline, (2) a variant without the view generator (directly feeding schema chunks to the planner/generator), and (3) a single-agent baseline using the full schema. These comparisons will quantify the specific benefit of the decomposition mechanism. revision: yes

  2. Referee: [§3.2] §3.2 (View Generator Agent): No per-stage evaluation of agentic view quality is provided, such as a correctness metric for the generated CTEs, manual inspection for semantic errors in predicates/joins, or analysis of cross-chunk schema omissions; this is load-bearing because the framework relies on these LLM-generated intermediates being reliable without introducing incorrect logic.

    Authors: We acknowledge that evaluating the quality of the generated agentic views is essential given their central role. We will add a new analysis subsection (likely in §3.2 or §4) that includes: manual inspection of a random sample of 50 generated views for semantic correctness, join accuracy, and predicate fidelity; an automated metric based on whether each view executes successfully against the database (one possible form of this check is sketched after these responses); and discussion of cross-chunk schema coverage by comparing view columns to the full schema. This will provide evidence on the reliability of the intermediates. revision: yes

  3. Referee: [§4.1] §4.1 (Experimental Setup): Details on controls for the stochastic LLM agents (temperature, number of runs, prompt sensitivity, or error analysis of failure cases) are absent, undermining confidence in the benchmark numbers and the claim that gains stem from reliable decomposition.

    Authors: We will expand §4.1 with the missing experimental controls. Specifically, we will report: the temperature settings used for each agent (e.g., 0.0 for the generator to promote determinism where possible), the number of independent runs (with mean and standard deviation), prompt sensitivity tests on a subset of queries, and a categorized error analysis of failure cases on Spider 2.0 distinguishing view-generation errors from final-SQL errors. This will increase confidence in the reported numbers. revision: yes
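
The executability half of the proposed view metric could be as simple as the following sketch, shown for SQLite (Spider 2.0's Snowflake dialect would need its own driver); semantic correctness would still require the manual inspection the authors describe.

```python
import sqlite3

def view_executes(conn: sqlite3.Connection, view_name: str, view_body: str) -> bool:
    """Check that a generated agentic view is at least executable by
    wrapping it in a WITH clause and probing a single row. This catches
    syntax and schema-linking errors, not semantic ones."""
    probe = f"WITH {view_name} AS ({view_body}) SELECT * FROM {view_name} LIMIT 1"
    try:
        conn.execute(probe)
        return True
    except sqlite3.Error:
        return False
```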

Circularity Check

0 steps flagged

No circularity; empirical benchmark results with no derivation chain

Full rationale

The paper describes an agentic pipeline (rewriter, view-generator, planner/generator/revisor) that produces CTE views and final SQL, then reports execution accuracies on public benchmarks (Spider 2.0 at 70.38%, Spider at 85.59%, etc.). No equations, parameter fits, uniqueness theorems, or first-principles derivations appear; the central claims are end-to-end empirical measurements rather than any quantity that reduces to its own inputs by construction. Self-citations, if present, are not load-bearing for any claimed result. The work is therefore self-contained empirical engineering with no detectable circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework rests on the assumption that LLM agents can reliably produce correct intermediate CTEs and that schema chunking plus view generation preserves query semantics; the abstract names no explicit free parameters, and the only invented entity is the agentic view construct itself.

axioms (2)
  • domain assumption: LLM agents can generate syntactically and semantically valid CTEs that correctly capture intermediate query logic from schema chunks
    Invoked implicitly by the view generator agent stage described in the abstract
  • domain assumption: Decomposing the query via views avoids context-window overflow and reduces syntax/schema-linking errors compared with one-shot generation
    Stated as the motivation for the three-stage pipeline
invented entities (1)
  • agentic view: no independent evidence
    purpose: An LLM-generated CTE that encapsulates intermediate query logic and filters relevant schema elements
    Central new construct introduced to enable decomposition; no independent evidence provided beyond the performance numbers

pith-pipeline@v0.9.0 · 5611 in / 1458 out tokens · 40251 ms · 2026-05-10T17:05:20.540210+00:00 · methodology

