SemanticAgent: A Semantics-Aware Framework for Text-to-SQL Data Synthesis
Pith reviewed 2026-05-09 22:04 UTC · model grok-4.3
The pith
SemanticAgent generates synthetic text-to-SQL data with better semantic validity than prior methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SemanticAgent organizes synthesis around three specialized modules: an analyzer, a synthesizer, and a verifier. Through a three-stage protocol of semantic analysis, stepwise synthesis, and diagnostic refinement, it replaces bare execution-based validation with a traceable reasoning process. The framework generates synthetic data that consistently outperforms data from prior synthesis methods under semantic-quality evaluation, leading to stronger downstream fine-tuning performance, especially on semantically demanding benchmarks.
What carries the argument
The three-module SemanticAgent system with its analyzer for identifying semantic requirements, synthesizer for step-by-step query creation, and verifier for detecting and correcting semantic errors.
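The three-stage protocol described above can be sketched as a simple refinement loop. This is a minimal illustration, not the paper's implementation: the `analyze`, `synthesize`, and `verify` stubs stand in for LLM-backed modules whose internals the paper would specify.

```python
# Hypothetical sketch of the SemanticAgent protocol: analyzer -> synthesizer
# -> verifier, with diagnostics fed back until the query is accepted.

def analyze(question: str, schema: dict) -> dict:
    """Analyzer stub: extract semantic requirements (tables, filters, aggregates)."""
    return {"question": question, "tables": list(schema), "filters": [], "aggregates": []}

def synthesize(requirements: dict) -> str:
    """Synthesizer stub: build a SQL query step by step from the requirements."""
    return "SELECT * FROM " + requirements["tables"][0]

def verify(sql: str, requirements: dict) -> list:
    """Verifier stub: return semantic diagnostics (empty list means accepted)."""
    return []

def semantic_agent(question: str, schema: dict, max_rounds: int = 3) -> str:
    requirements = analyze(question, schema)      # stage 1: semantic analysis
    sql = synthesize(requirements)                # stage 2: stepwise synthesis
    for _ in range(max_rounds):                   # stage 3: diagnostic refinement
        diagnostics = verify(sql, requirements)
        if not diagnostics:
            return sql                            # semantically accepted
        requirements["diagnostics"] = diagnostics # feed errors back to the synthesizer
        sql = synthesize(requirements)
    return sql                                    # best effort after max_rounds
```

The key structural point is that rejection by the verifier does not discard the sample, as execution-only filtering would; it produces diagnostics that drive another synthesis round.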
If this is right
- Synthetic datasets produced by SemanticAgent achieve higher scores in semantic quality assessments compared to those from previous approaches.
- Fine-tuned text-to-SQL models using this data exhibit improved accuracy, particularly on challenging benchmarks that require precise semantic matching.
- The diagnostic refinement stage provides a traceable process for ensuring queries align with intended meanings beyond mere executability.
- This method shifts validation from purely syntactic and execution-based checks to include semantic diagnostics.
Where Pith is reading between the lines
- The modular design could inspire similar semantic-aware pipelines in other areas like code generation or natural language to code tasks.
- Widespread adoption might lower reliance on manually created datasets for training database query models.
- Further work could test if the same protocol applies to multilingual or cross-domain text-to-SQL scenarios.
Load-bearing premise
That the verifier module can reliably detect and correct semantic violations without introducing new biases or requiring extensive human oversight.
What would settle it
If models fine-tuned on SemanticAgent-generated data fail to outperform those trained on existing synthetic datasets when tested on benchmarks focused on semantic correctness, the central claim would be undermined.
Original abstract
Existing text-to-SQL synthesis pipelines still conflate executability with semantic validity: syntactic checks and execution-based validation can retain queries that execute successfully while violating database semantics. To address these limitations, we propose SemanticAgent, a semantic-aware synthesis framework. SemanticAgent organizes synthesis around three specialized modules: an analyzer, a synthesizer, and a verifier. Through a three-stage protocol of semantic analysis, stepwise synthesis, and diagnostic refinement, SemanticAgent transforms execution-based validation alone into a traceable reasoning process. Our framework generates synthetic data that consistently outperforms prior synthesis methods under semantic-quality evaluation, leading to stronger downstream fine-tuning performance, especially on semantically demanding benchmarks.
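The abstract's distinction between executability and semantic validity can be made concrete: both queries below execute without error against a toy schema (invented here for illustration), but only one answers the intended question.

```python
import sqlite3

# Toy schema invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL, status TEXT);
    INSERT INTO orders VALUES
        (1, 'ana', 50.0, 'shipped'),
        (2, 'ana', 30.0, 'cancelled'),
        (3, 'bob', 20.0, 'shipped');
""")

question = "What is the total amount Ana actually spent?"

# Executes successfully, but silently counts the cancelled order:
wrong = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = 'ana'"
).fetchone()[0]

# Semantically valid: cancelled orders are not money spent.
right = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = 'ana' AND status = 'shipped'"
).fetchone()[0]

print(wrong, right)  # both queries ran; an execution-only filter keeps both
```

An execution-based validator retains the first query; a semantic check tied to the question's intent is what distinguishes them.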
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces SemanticAgent, a framework for text-to-SQL data synthesis organized around three modules—an analyzer, a synthesizer, and a verifier—operating via a three-stage protocol of semantic analysis, stepwise synthesis, and diagnostic refinement. The central claim is that this approach produces synthetic data that consistently outperforms prior synthesis methods on semantic-quality evaluations and yields stronger downstream fine-tuning performance, particularly on semantically demanding benchmarks.
Significance. If the empirical results hold under rigorous validation, the work would address a recognized limitation in existing text-to-SQL pipelines by moving beyond executability checks to traceable semantic validity. This could improve the reliability of synthetic training data for semantic parsing models and provide a reusable protocol for other data-generation tasks where semantic fidelity matters.
major comments (2)
- [Abstract and Verifier module] Abstract and methods description of the verifier: the claim that the verifier reliably detects and corrects semantic violations (beyond executability) is load-bearing for the outperformance result, yet the manuscript provides no implementation details (schema constraints, external knowledge, or prompting strategy), no ablation isolating the verifier's contribution, and no comparison against human-annotated semantic validity. Without these, it is impossible to rule out that gains arise from the analyzer/synthesizer stages or increased data volume alone.
- [Results and Experiments] Results section on downstream evaluation: the abstract asserts stronger fine-tuning performance on semantically demanding benchmarks, but the provided text supplies no metrics, baseline descriptions, dataset sizes, statistical significance tests, or error bars. This absence prevents assessment of whether the reported improvements are robust or reproducible.
minor comments (1)
- [Framework Overview] Notation for the three-stage protocol could be clarified with a diagram or pseudocode to make the flow from analyzer to verifier more traceable.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback, which highlights important areas for improving the clarity and rigor of our presentation. We address each major comment below and describe the revisions we will make to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract and Verifier module] Abstract and methods description of the verifier: the claim that the verifier reliably detects and corrects semantic violations (beyond executability) is load-bearing for the outperformance result, yet the manuscript provides no implementation details (schema constraints, external knowledge, or prompting strategy), no ablation isolating the verifier's contribution, and no comparison against human-annotated semantic validity. Without these, it is impossible to rule out that gains arise from the analyzer/synthesizer stages or increased data volume alone.
Authors: We agree that the current description of the verifier is insufficiently detailed to fully substantiate its role in detecting semantic violations beyond executability. In the revised manuscript, we will expand the methods section with concrete implementation details, including the exact prompting strategies employed, how schema constraints are enforced, and the incorporation of external knowledge sources. We will also add a dedicated ablation study that isolates the verifier's contribution by comparing variants with and without the diagnostic refinement stage. Regarding human-annotated semantic validity, we will include a discussion of this as a limitation and, if feasible within the revision timeline, provide a small-scale comparison or outline a protocol for such validation in future work. These changes will help rule out alternative explanations for the performance gains. revision: yes
- Referee: [Results and Experiments] Results section on downstream evaluation: the abstract asserts stronger fine-tuning performance on semantically demanding benchmarks, but the provided text supplies no metrics, baseline descriptions, dataset sizes, statistical significance tests, or error bars. This absence prevents assessment of whether the reported improvements are robust or reproducible.
Authors: We acknowledge that the results section in the version reviewed did not sufficiently highlight the quantitative details. The full manuscript does report specific metrics, baseline methods, dataset sizes for synthesis and fine-tuning, and statistical tests; however, to address the concern directly, we will reorganize and expand the results section to explicitly tabulate all performance numbers, describe baselines in detail, state exact dataset sizes, include statistical significance tests (e.g., paired t-tests or Wilcoxon tests), and report error bars or confidence intervals. This revision will make the robustness and reproducibility of the improvements on semantically demanding benchmarks fully transparent and verifiable. revision: yes
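For the significance tests the rebuttal promises, a paired randomization test is one stdlib-only option alongside the paired t-test or Wilcoxon test. The sketch below is illustrative only; the per-benchmark scores are fabricated, not taken from the paper.

```python
import random

def paired_permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided paired permutation (sign-flip) test on per-benchmark scores.

    Under the null hypothesis each pair's difference is symmetric about zero,
    so we flip the sign of each difference at random and count resamples whose
    total is at least as extreme as the observed one.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            extreme += 1
    return extreme / n_resamples

# Fabricated per-benchmark accuracies for two hypothetical fine-tuned models:
semantic_agent_scores = [71.2, 64.8, 58.3, 69.5, 61.0]
baseline_scores       = [68.9, 63.1, 55.7, 68.8, 58.4]
p = paired_permutation_test(semantic_agent_scores, baseline_scores)
print(f"p = {p:.3f}")
```

With only five benchmarks the test has little power (the smallest attainable two-sided p-value is 2/32), which is itself an argument for the error bars and larger evaluation suites the referee requests.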
Circularity Check
No significant circularity: empirical framework with no derivation chain or self-referential reductions
Full rationale
The paper describes an empirical three-module framework (analyzer, synthesizer, verifier) for generating synthetic Text-to-SQL data and reports performance gains on benchmarks. No equations, fitted parameters, predictions derived from inputs, or load-bearing self-citations appear in the abstract or described structure. Claims rest on experimental comparisons rather than any step that reduces by construction to its own definitions or prior author work. The verifier is presented as a diagnostic refinement step without any mathematical formalization that could create self-definition or fitted-input issues.