Evaluating the Practical Effectiveness of LLM-Driven Index Tuning with Microsoft Database Tuning Advisor

Surajit Chaudhuri; Vivek Narasayya; Wentao Wu; Xiaoying Wang

arxiv: 2603.09181 · v2 · submitted 2026-03-10 · 💻 cs.DB

Evaluating the Practical Effectiveness of LLM-Driven Index Tuning with Microsoft Database Tuning Advisor

Xiaoying Wang , Wentao Wu , Vivek Narasayya , Surajit Chaudhuri This is my paper

Pith reviewed 2026-05-15 13:54 UTC · model grok-4.3

classification 💻 cs.DB

keywords index tuningLLMDatabase Tuning AdvisorDTAquery optimizationexecution timeperformance tuningSQL Server

0 comments

The pith

LLM-driven index tuning can find configurations that significantly outperform DTA in execution time for a considerable number of cases, though DTA is generally more reliable.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper evaluates large language model approaches to recommending database indexes against Microsoft's Database Tuning Advisor (DTA) using both standard industrial benchmarks and real enterprise customer workloads. It measures success by actual query execution time after applying the suggested indexes. The results show DTA produces more consistent recommendations overall, but LLMs can discover superior index sets in many cases after only a few attempts. LLM reasoning often mirrors intuitive human judgments about index choices, which could be extracted to strengthen traditional methods. Direct production use faces barriers from high performance variance, weak gains when fused with DTA, and expensive validation steps.

Core claim

Although DTA is generally more reliable, with a few invocations, LLM can identify configurations that significantly outperform those found by DTA in execution time in a considerable number of cases, highlighting its potential as a complementary technique. We also observe that LLM's reasoning captures human-intuitive insights that may be distilled to potentially improve DTA. However, adopting LLM-driven index tuning in production remains challenging due to its substantial performance variance, limited and often negative impact when directly integrated into DTA, and the high cost of performance validation.

What carries the argument

Comparison of index configurations from DTA's what-if API cost estimates versus LLM suggestions, validated through actual query execution times on benchmarks and customer workloads.

Load-bearing premise

The tested benchmarks and real-world customer workloads are representative, and measured execution time improvements accurately reflect production benefits without unaccounted confounding factors like hardware variation.

What would settle it

Repeating the experiments on a fresh collection of customer workloads and finding that LLM fails to produce outperforming configurations in a comparable fraction of cases would falsify the claim.

Figures

Figures reproduced from arXiv: 2603.09181 by Surajit Chaudhuri, Vivek Narasayya, Wentao Wu, Xiaoying Wang.

**Figure 1.** Figure 1: Single-query workload prompt template. 2 METHODOLOGY 2.1 Benchmark and Customer Workloads Unlike existing work that has been primarily evaluated using public benchmarks with synthetic data and queries, we focus on real-world customer workloads in our study [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗

**Figure 3.** Figure 3: Multi-query workload prompt template. Remark. Our goal is not to identify an optimal prompt, but to provide LLM with basic information that a human expert would need to make informed index recommendations. In this study, we focus on evaluating the fundamental capability of LLM for index tuning, given only rudimentary information. More variations of LLM-driven index tuning are discussed in Section 7. 2.3.2 … view at source ↗

**Figure 4.** Figure 4: LLM-driven index tuning (best) vs. DTA for tuning single-query workloads (marker denotes the worst run of LLM). [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 5.** Figure 5: Comparison of index usage between LLM-driven index tuning and DTA for tuning single-query workloads. [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗

**Figure 6.** Figure 6: Comparison of the estimated costs between LLM [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗

**Figure 7.** Figure 7: Examples of GPT-5’s reasoning process. In this section, we investigate this question in detail. We start by examining the underlying reasoning processes of GPT-5 for tuning single-query workloads. We observe that GPT-5’s reasoning follows several intuitive insights that align with human judgment and can be summarized into a few rules of thumb. Based on this observation, we explore whether these insights ca… view at source ↗

**Figure 8.** Figure 8: Evaluation of the simple index tuner (y-axis: [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗

**Figure 9.** Figure 9: LLM-driven index tuning (five invocations shown as bars) vs. DTA (red line) for tuning multi-query workloads. [PITH_FULL_IMAGE:figures/full_fig_p009_9.png] view at source ↗

**Figure 10.** Figure 10: Analysis of Real-D 𝐾 = 10. LLM’s indexes occupy only 7.7 GB. This shows that LLM does not achieve its performance by inflating the storage, indicating that its recommendation is indeed superior in this case. Per-query Improvement. Beyond total execution time, we also evaluated the benefit of the index recommendations at the perquery level [PITH_FULL_IMAGE:figures/full_fig_p010_10.png] view at source ↗

**Figure 12.** Figure 12: Time breakdown for performance validation. [PITH_FULL_IMAGE:figures/full_fig_p011_12.png] view at source ↗

read the original abstract

Index tuning is critical for the performance of modern database systems. Industrial index tuners, such as the Database Tuning Advisor (DTA) developed for Microsoft SQL Server, rely on the "what-if" API provided by the query optimizer to estimate the cost of a query given an index configuration, which can lead to suboptimal recommendations when the estimations are inaccurate. Large language model (LLM) offers a new approach to index tuning, with knowledge learned from web-scale training datasets. However, the effectiveness of LLM-driven index tuning, especially beyond what is already achieved by commercial index tuners, remains unclear. In this paper, we study the practical effectiveness of LLM-driven index tuning using both industrial benchmarks and real-world enterprise customer workloads, and compare it with DTA. Our results show that although DTA is generally more reliable, with a few invocations, LLM can identify configurations that significantly outperform those found by DTA in execution time in a considerable number of cases, highlighting its potential as a complementary technique. We also observe that LLM's reasoning captures human-intuitive insights that may be distilled to potentially improve DTA. However, adopting LLM-driven index tuning in production remains challenging due to its substantial performance variance, limited and often negative impact when directly integrated into DTA, and the high cost of performance validation. This work provides motivation, lessons, and practical insights that will inspire future work on LLM-driven index tuning both in academia and industry.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

LLMs can beat DTA on some real workloads but the outperformance claims lack the measurement controls needed to trust them.

read the letter

The main thing here is that LLMs sometimes find index configurations that run faster than DTA's on both benchmarks and customer workloads, but DTA stays more reliable overall and direct integration does not help much. The authors also pull out some intuitive reasoning patterns from the LLM that could be fed back into traditional tuners, which is a useful observation. They close with clear notes on variance, validation cost, and why production use is still hard. That practical framing is the paper's real contribution. It is new in running the comparison on actual enterprise workloads rather than just TPC-style benchmarks, and it gives concrete lessons on complementarity instead of just claiming LLMs are better. The discussion of why LLM suggestions do not slot cleanly into DTA is honest and worth having on record. The soft spot is the experimental detail. The abstract and stress-test note both flag missing information on repeated runs, cold-cache protocols, buffer-pool controls, or statistical tests for the execution-time wins. Without those, it is difficult to rule out noise or workload-specific artifacts as the source of the reported speedups. If the full paper has those controls, the results become much stronger; if not, the central claim needs more work. This is for database systems people who already use or build index tuners and want early evidence on LLM hybrids. Readers who care about real deployment friction will get value from the lessons even if they stay skeptical of the wins. It deserves a serious referee because the workload data and the integration challenges are worth referee time, though the authors should expect requests for tighter methodology and more runs before acceptance.

Referee Report

2 major / 1 minor

Summary. The paper evaluates the practical effectiveness of LLM-driven index tuning compared to Microsoft's Database Tuning Advisor (DTA) using industrial benchmarks and real-world enterprise customer workloads. It claims that while DTA is generally more reliable, LLMs can with a few invocations identify index configurations that significantly outperform DTA in execution time in a considerable number of cases; LLM reasoning also captures human-intuitive insights that could improve DTA, though production adoption is challenged by performance variance, limited integration benefits, and high validation costs.

Significance. If the results hold under rigorous controls, the work is significant for providing empirical evidence on LLM-based index tuning as a potential complement to commercial tools like DTA. The inclusion of real-world customer workloads is a notable strength, offering practical lessons on variance and adoption barriers that could guide hybrid tuning systems in database research.

major comments (2)

[§4] §4 (Experimental Evaluation): The comparative results on execution-time outperformance lack any description of measurement protocols, including whether runs were repeated, cold-cache conditions were enforced, statistical significance was tested, error bars reported, or controls applied for buffer-pool state, concurrent load, and hardware variation. This directly undermines the central claim that LLM configurations significantly outperform DTA, as unaccounted noise could explain the reported wins.
[§5] §5 (Discussion of Insights): The assertion that LLM reasoning captures human-intuitive insights lacks concrete examples from the workloads or a quantified analysis of how these could be distilled to improve DTA, rendering the complementarity argument anecdotal rather than evidence-based.

minor comments (1)

[Abstract] Abstract: The term 'considerable number of cases' is imprecise; stating the exact fraction or count of workloads where LLM outperforms DTA would strengthen clarity without altering the narrative.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review. The comments highlight important areas for improving the clarity and rigor of our experimental and discussion sections. We address each major comment below and will revise the manuscript to incorporate the suggested enhancements.

read point-by-point responses

Referee: [§4] §4 (Experimental Evaluation): The comparative results on execution-time outperformance lack any description of measurement protocols, including whether runs were repeated, cold-cache conditions were enforced, statistical significance was tested, error bars reported, or controls applied for buffer-pool state, concurrent load, and hardware variation. This directly undermines the central claim that LLM configurations significantly outperform DTA, as unaccounted noise could explain the reported wins.

Authors: We agree that the original manuscript did not provide sufficient detail on the measurement protocols, which is a valid concern for validating the execution-time comparisons. In the revised version, we will expand the experimental setup subsection in §4 to explicitly describe: (1) each configuration was executed five times with the buffer pool flushed between runs to enforce cold-cache conditions; (2) statistical significance was evaluated using paired t-tests (p < 0.05); (3) error bars in figures represent one standard deviation; and (4) controls included dedicated hardware with no concurrent workloads and fixed server configurations to minimize variation. These additions will strengthen the reliability of the outperformance claims without altering the reported results. revision: yes
Referee: [§5] §5 (Discussion of Insights): The assertion that LLM reasoning captures human-intuitive insights lacks concrete examples from the workloads or a quantified analysis of how these could be distilled to improve DTA, rendering the complementarity argument anecdotal rather than evidence-based.

Authors: We acknowledge that the discussion in §5 would benefit from greater specificity. In the revision, we will include concrete examples drawn from the enterprise customer workloads, such as cases where the LLM recommended covering indexes for multi-column join patterns that aligned with common DBA practices but were not selected by DTA's cost model. However, a full quantified analysis of distilling these insights into modifications for DTA would require new experiments and implementation work that extends beyond the current study's scope; we will explicitly note this limitation and position it as an avenue for future research to make the complementarity argument more rigorous. revision: partial

Circularity Check

0 steps flagged

No circularity: empirical evaluation relies on external benchmarks and workloads

full rationale

This paper is a pure empirical study comparing LLM index tuning against DTA on industrial benchmarks and real customer workloads. No mathematical derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation chain. All claims rest on direct experimental measurements against external data sources, with no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical evaluation paper with no mathematical derivations, fitted parameters, or new postulated entities.

pith-pipeline@v0.9.0 · 5566 in / 1116 out tokens · 52808 ms · 2026-05-15T13:54:45.820888+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

87 extracted references · 87 canonical work pages · 5 internal anchors

[1]

Microsoft SQL Server Missing Indexes

2025. Microsoft SQL Server Missing Indexes. https://learn.microsoft.com/en- us/sql/relational-databases/indexes/tune-nonclustered-missing-index- suggestions?view=sql-server-ver17

work page 2025
[2]

Marathe, Vivek R

Sanjay Agrawal, Surajit Chaudhuri, Lubor Kollár, Arunprasad P. Marathe, Vivek R. Narasayya, and Manoj Syamala. 2004. Database Tuning Advisor for Microsoft SQL Server 2005. InVLDB. 1110–1121

work page 2004
[3]

Narasayya

Sanjay Agrawal, Surajit Chaudhuri, and Vivek R. Narasayya. 2001. Materialized View and Index Selection Tool for Microsoft SQL Server 2000. InSIGMOD

work page 2001
[4]

Dana Van Aken et al. 2021. An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems. Proc. VLDB Endow.14, 7 (2021), 1241–1253

work page 2021
[5]

Gordon, and Bohan Zhang

Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, and Bohan Zhang. 2017. Automatic Database Management System Tuning Through Large-scale Machine Learning. InSIGMOD. ACM, 1009–1024

work page 2017
[6]

Peter Akioyamen, Zixuan Yi, and Ryan Marcus. 2024. The Unreasonable Effec- tiveness of LLMs for Query Optimization.CoRRabs/2411.02862 (2024)

work page arXiv 2024
[7]

Matteo Brucato, Tarique Siddiqui, Wentao Wu, Vivek Narasayya, and Surajit Chaudhuri. 2024. Wred: Workload Reduction for Scalable Index Tuning.Proc. ACM Manag. Data2, 1, Article 50 (2024), 26 pages

work page 2024
[8]

Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. InSIGMOD. 227–238

work page 2005
[9]

Stefano Cereda et al. 2021. CGPTuner: a Contextual Gaussian Process Bandit Ap- proach for the Automatic Tuning of IT Configurations Under Varying Workload Conditions.Proc. VLDB Endow.14, 8 (2021), 1401–1413

work page 2021
[10]

Sunil Chakkappen et al. 2025. Automatic Indexing in Oracle.Proc. VLDB Endow. 18, 12 (2025), 4924–4937

work page 2025
[11]

Narasayya

Surajit Chaudhuri, Ashish Kumar Gupta, and Vivek R. Narasayya. 2002. Com- pressing SQL workloads. InSIGMOD. 488–499

work page 2002
[12]

Surajit Chaudhuri and Vivek Narasayya. 2020. Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server

work page 2020
[13]

Narasayya

Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. InVLDB. 146–155

work page 1997
[14]

Narasayya

Surajit Chaudhuri and Vivek R. Narasayya. 1998. AutoAdmin ’What-if’ Index Analysis Utility. InSIGMOD. 367–378

work page 1998
[15]

Narasayya

Surajit Chaudhuri and Vivek R. Narasayya. 1999. Index Merging. InICDE

work page 1999
[16]

Sudipto Das et al. 2019. Automatically Indexing Millions of Databases in Mi- crosoft Azure SQL Database. InSIGMOD. 666–679

work page 2019
[17]

Debabrata Dash, Neoklis Polyzotis, and Anastasia Ailamaki. 2011. CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads.Proc. VLDB Endow.4, 6 (2011), 362–372

work page 2011
[18]

Naughton, and Stratis Viglas

Shaleen Deep, Anja Gruenheid, Paraschos Koutris, Jeffrey F. Naughton, and Stratis Viglas. 2020. Comprehensive and Efficient Workload Compression.Proc. VLDB Endow.14, 3 (2020), 418–430

work page 2020
[19]

Sriram Dharwada, Himanshu Devrani, Jayant Haritsa, and Harish Doraiswamy

work page
[20]

Query rewriting via llms.arXiv preprint arXiv:2502.12918(2025)

work page arXiv 2025
[21]

Narasayya

Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. InSIGMOD. 1241–1258

work page 2019
[22]

Narasayya

Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya

work page
[23]

VLDB Endow.11, 10 (2018), 1123–1136

Plan Stitch: Harnessing the Best of Many Plans.Proc. VLDB Endow.11, 10 (2018), 1123–1136

work page 2018
[24]

Yufeng Du, Minyang Tian, Srikanth Ronanki, Subendhu Rongali, Sravan Babu Bodapati, Aram Galstyan, Azton Wells, Roy Schwartz, Eliu A Huerta, and Hao Peng. 2025. Context Length Alone Hurts LLM Performance Despite Perfect Retrieval. InFindings of the Association for Computational Linguistics: EMNLP 2025, Christos Christodoulopoulos, Tanmoy Chakraborty, Carol...

work page doi:10.18653/v1/2025.findings-emnlp.1264 2025
[25]

Songyun Duan et al . 2009. Tuning Database Configuration Parameters with iTuned.Proc. VLDB Endow.2, 1 (2009), 1246–1257

work page 2009
[26]

Victor Giannakouris and Immanuel Trummer. 2024. DBG-PT: A Large Language Model Assisted Query Performance Regression Debugger.Proc. VLDB Endow.17, 12 (2024), 4337–4340. https://www.vldb.org/pvldb/vol17/p4337-giannakouris.pdf

work page 2024
[27]

Victor Giannakouris and Immanuel Trummer. 2025. 𝜆-Tune: Harnessing Large Language Models for Automated Database System Tuning.Proc. ACM Manag. Data3, 1 (2025), 2:1–2:26

work page 2025
[28]

Goetz Graefe. 1995. The Cascades Framework for Query Optimization.IEEE Data Eng. Bull.18, 3 (1995), 19–29

work page 1995
[29]

Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang. 2024. MiniLLM: Knowledge Distillation of Large Language Models. InThe Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=5h0qf7IBZZ

work page 2024
[30]

Beliz Gunel, Jingfei Du, Alexis Conneau, and Veselin Stoyanov. 2021. Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning. In9th Inter- national Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net. https://openreview.net/forum?id=cu7IUiOhujH

work page 2021
[31]

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al . 2025. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning.arXiv preprint arXiv:2501.12948(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[32]

Aaron Hurst, Adam Lerer, Adam P Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, et al. 2024. Gpt-4o system card.arXiv preprint arXiv:2410.21276(2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024
[33]

Ioannidis and Stavros Christodoulakis

Yannis E. Ioannidis and Stavros Christodoulakis. 1991. On the Propagation of Errors in the Size of Join Results. InSIGMOD. 268–277

work page 1991
[34]

Kunal Jha, Wilka Carvalho, Yancheng Liang, Simon Shaolei Du, Max Kleiman- Weiner, and Natasha Jaques. 2025. Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination. InICML

work page 2025
[35]

Andrew Kane. 2017. Introducing Dexter, the Automatic Indexer for Post- gres. https://medium.com/@ankane/introducing-dexter-the-automatic-indexer- for-postgres-5f8fa8b28f27

work page 2017
[36]

Konstantinos Kanellis et al. 2022. LlamaTune: Sample-Efficient DBMS Configu- ration Tuning.Proc. VLDB Endow.15, 11 (2022), 2953–2965

work page 2022
[37]

Jan Kossmann, Stefan Halfpap, Marcel Jankrift, and Rainer Schlosser. 2020. Magic mirror in my hand, which is the best in the land? An Experimental Evaluation of Index Selection Algorithms.Proc. VLDB Endow.13, 11 (2020), 2382–2395

work page 2020
[38]

Jiale Lao et al. 2025. GPTuner: An LLM-Based Database Tuning System.SIGMOD Rec.54, 1 (2025), 101–110

work page 2025
[39]

Viktor Leis et al. 2015. How Good Are Query Optimizers, Really?PVLDB9, 3 (2015), 204–215

work page 2015
[40]

Ao Li, Yuexiang Xie, Songze Li, Fugee Tsung, Bolin Ding, and Yaliang Li. 2025. Agent-Oriented Planning in Multi-Agent Systems. InThe Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net. https://openreview.net/forum?id=EqcLAU6gyU

work page 2025
[41]

Guoliang Li, Xuanhe Zhou, Shifu Li, and Bo Gao. 2019. QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning.Proc. VLDB Endow. 12, 12 (2019), 2118–2130

work page 2019
[42]

Zhaodonghui Li, Haitao Yuan, Jiachen Shi, Hao Zhang, Yu Rong, and Gao Cong

work page
[43]

MAAdvisor: Zero-Shot Index Advisor using Multi-Agent LLMs.CoRR abs/2508.16044 (2025)

work page arXiv 2025
[44]

Zhaodonghui Li, Haitao Yuan, Huiming Wang, Gao Cong, and Lidong Bing

work page
[45]

VLDB Endow.18, 1 (Sept

LLM-R2: A Large Language Model Enhanced Rule-Based Rewrite System for Boosting Query Efficiency.Proc. VLDB Endow.18, 1 (Sept. 2024), 53–65. https://doi.org/10.14778/3696435.3696440

work page doi:10.14778/3696435.3696440 2024
[46]

Wan Shen Lim, Lin Ma, William Zhang, Matthew Butrovich, Samuel Arch, and Andrew Pavlo. 2024. Hit the gym: accelerating query execution to efficiently bootstrap behavior models for self-driving database management systems.Pro- ceedings of the VLDB Endowment17, 11 (2024), 3680–3693

work page 2024
[47]

Jie Liu and Barzan Mozafari. 2024. GenRewrite: Query Rewriting via Large Language Models.arXiv preprint arXiv:2403.09060(2024)

work page arXiv 2024
[48]

Guy Lohman. [n.d.]. Is Query Optimization a “Solved” Problem? http://wp. sigmod.org/?p=1075

work page
[49]

Lin Ma, Bailu Ding, Sudipto Das, and Adith Swaminathan. 2020. Active Learning for ML Enhanced Database Systems. InProceedings of the 2020 International Conference on Management of Data, SIGMOD Conference 2020, online conference [Portland, OR, USA], June 14-19, 2020, David Maier, Rachel Pottinger, AnHai Doan, Wang-Chiew Tan, Abdussalam Alawini, and Hung Q....

work page doi:10.1145/3318464.3389768 2020
[50]

Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, and Nan Duan. 2023. Query Rewriting in Retrieval-Augmented Large Language Models. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association for Computational Linguistics, Singapore, 5303–5315. https://doi.org/10.1865...

work page doi:10.18653/v1/2023.emnlp- 2023
[51]

Amir M. Mansourian, Rozhan Ahmadi, Masoud Ghafouri, Amir Mohammad Babaei, Elaheh Badali Golezani, Zeynab yasamani ghamchi, Vida Ramezanian, Alireza Taherian, Kimia Dinashi, Amirali Miri, and Shohreh Kasaei. 2025. A Comprehensive Survey on Knowledge Distillation.Transactions on Machine Learning Research(2025). https://openreview.net/forum?id=3cbJzdR78B

work page 2025
[52]

Ryan Marcus and Olga Papaemmanouil. 2019. Plan-Structured Deep Neural Network Models for Query Performance Prediction.Proc. VLDB Endow.12, 11 (2019), 1733–1746. https://doi.org/10.14778/3342263.3342646

work page doi:10.14778/3342263.3342646 2019
[53]

Microsoft. 2026. Azure Virtual Machines. https://azure.microsoft.com/en-us/ products/virtual-machines

work page 2026
[54]

Narasayya and Surajit Chaudhuri

Vivek R. Narasayya and Surajit Chaudhuri. 2026. Leveraging Query Opti- mizers to Verify the Soundness of LLM-based Query Rewrites for Real-World Workloads, and More. In16th Conference on Innovative Data Systems Research, CIDR 2026, Chaminade, CA, USA, January 18-21, 2026. www.cidrdb.org. https://vldb.org/cidrdb/2026/leveraging-query-optimizers-to-verify-t...

work page 2026
[55]

OpenAI. 2025. Gpt-5 system card. https://cdn.openai.com/gpt-5-system-card. pdf

work page 2025
[56]

Manning, Ste- fano Ermon, and Chelsea Finn

Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Ste- fano Ermon, and Chelsea Finn. 2023. Direct Preference Optimization: Your 13 Language Model is Secretly a Reward Model. InAdvances in Neural Infor- mation Processing Systems 36: Annual Conference on Neural Information Pro- cessing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, D...

work page 2023
[57]

Rainer Schlosser, Jan Kossmann, and Martin Boissier. 2019. Efficient Scalable Multi-attribute Index Selection Using Recursive Strategies. InICDE. 1238–1249

work page 2019
[58]

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov

work page
[59]

Proximal Policy Optimization Algorithms

Proximal Policy Optimization Algorithms.CoRRabs/1707.06347 (2017). arXiv:1707.06347 http://arxiv.org/abs/1707.06347

work page internal anchor Pith review Pith/arXiv arXiv 2017
[60]

Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. 2024. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models.CoRRabs/2402.03300 (2024). https://doi.org/10.48550/ARXIV.2402.03300 arXiv:2402.03300

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2402.03300 2024
[61]

Chi, Nathanael Schärli, and Denny Zhou

Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed H. Chi, Nathanael Schärli, and Denny Zhou. 2023. Large Language Models Can Be Easily Distracted by Irrelevant Context. InProceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research), Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barba...

work page 2023
[62]

Tarique Siddiqui and Wentao Wu. 2023. ML-Powered Index Tuning: An Overview of Recent Progress and Open Challenges.SIGMOD Rec.52, 4 (2023), 19–30

work page 2023
[63]

Narasayya, and Surajit Chaudhuri

Tarique Siddiqui, Wentao Wu, Vivek R. Narasayya, and Surajit Chaudhuri. 2022. DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing In- dexes for Scalable Index Tuning.Proc. VLDB Endow.15, 10 (2022), 2019–2031

work page 2022
[64]

Zhaoyan Sun, Xuanhe Zhou, Guoliang Li, Xiang Yu, Jianhua Feng, and Yong Zhang. 2025. R-Bot: An LLM-based Query Rewrite System.Proc. VLDB Endow. 18, 12 (2025), 5031–5044

work page 2025
[65]

Jie Tan, Kangfei Zhao, Rui Li, Jeffrey Xu Yu, Chengzhi Piao, Hong Cheng, Helen Meng, Deli Zhao, and Yu Rong. 2025. Can Large Language Models Be Query Optimizer for Relational Databases?Proc. ACM Manag. Data3, 6 (2025), 1–28. https://doi.org/10.1145/3769771

work page doi:10.1145/3769771 2025
[66]

Immanuel Trummer. 2024. DB-BERT: making database tuning tools "read" the manual.VLDB J.33, 4 (2024), 1085–1104

work page 2024
[67]

Gary Valentin et al. 2000. DB2 Advisor: An Optimizer Smart Enough to Recom- mend Its Own Indexes. InICDE. 101–110

work page 2000
[68]

Junxiong Wang et al. 2021. UDO: Universal Database Optimization using Rein- forcement Learning.Proc. VLDB Endow.14, 13 (2021), 3402–3414

work page 2021
[69]

Xiaoying Wang, Changbo Qu, Weiyuan Wu, Jiannan Wang, and Qingqing Zhou

work page
[70]

VLDB Endow.14, 9 (2021), 1640–1654

Are We Ready For Learned Cardinality Estimation?Proc. VLDB Endow.14, 9 (2021), 1640–1654

work page 2021
[71]

Narasayya, and Surajit Chaudhuri

Xiaoying Wang, Wentao Wu, Vivek R. Narasayya, and Surajit Chaudhuri. 2025. Esc: An Early-Stopping Checker for Budget-aware Index Tuning.Proc. VLDB Endow.18, 5 (2025), 1278–1290

work page 2025
[72]

Narasayya, and Surajit Chaud- huri

Xiaoying Wang, Wentao Wu, Chi Wang, Vivek R. Narasayya, and Surajit Chaud- huri. 2024. Wii: Dynamic Budget Reallocation In Index Tuning.Proc. ACM Manag. Data2, 3 (2024), 182

work page 2024
[73]

Kyu-Young Whang. 1985. Index Selection in Relational Databases. InFoundations of Data Organization. 487–500

work page 1985
[74]

Wentao Wu. 2025. Hybrid Cost Modeling for Reducing Query Performance Regression in Index Tuning.IEEE Trans. Knowl. Data Eng.37, 1 (2025), 379–391

work page 2025
[75]

Naughton

Wentao Wu, Yun Chi, Shenghuo Zhu, Jun’ichi Tatemura, Hakan Hacigümüs, and Jeffrey F. Naughton. 2013. Predicting query execution time: Are optimizer cost models really unusable?. InICDE. 1081–1092

work page 2013
[76]

Narasayya, and Surajit Chaudhuri

Wentao Wu, Anshuman Dutt, Gaoxiang Xu, Vivek R. Narasayya, and Surajit Chaudhuri. 2025. Understanding and Detecting Query Performance Regression in Practical Index Tuning: [Experiments & Analysis].Proc. ACM Manag. Data3, 6 (2025), 1–26

work page 2025
[77]

Naughton, and Harneet Singh

Wentao Wu, Jeffrey F. Naughton, and Harneet Singh. 2016. Sampling-Based Query Re-Optimization. InSIGMOD. 1721–1736

work page 2016
[78]

Narasayya, Surajit Chaudhuri, and Philip A

Wentao Wu, Chi Wang, Tarique Siddiqui, Junxiong Wang, Vivek R. Narasayya, Surajit Chaudhuri, and Philip A. Bernstein. 2022. Budget-aware Index Tuning with Reinforcement Learning. InSIGMOD. 1528–1541

work page 2022
[79]

Valluri, and Mohamed Zaït

Ritwik Yadav, Satyanarayana R. Valluri, and Mohamed Zaït. 2023. AIM: A practical approach to automated index management for SQL databases. InICDE. 3349–3362

work page 2023
[80]

An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. 2025. Qwen3 technical report.arXiv preprint arXiv:2505.09388(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

Showing first 80 references.

[1] [1]

Microsoft SQL Server Missing Indexes

2025. Microsoft SQL Server Missing Indexes. https://learn.microsoft.com/en- us/sql/relational-databases/indexes/tune-nonclustered-missing-index- suggestions?view=sql-server-ver17

work page 2025

[2] [2]

Marathe, Vivek R

Sanjay Agrawal, Surajit Chaudhuri, Lubor Kollár, Arunprasad P. Marathe, Vivek R. Narasayya, and Manoj Syamala. 2004. Database Tuning Advisor for Microsoft SQL Server 2005. InVLDB. 1110–1121

work page 2004

[3] [3]

Narasayya

Sanjay Agrawal, Surajit Chaudhuri, and Vivek R. Narasayya. 2001. Materialized View and Index Selection Tool for Microsoft SQL Server 2000. InSIGMOD

work page 2001

[4] [4]

Dana Van Aken et al. 2021. An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems. Proc. VLDB Endow.14, 7 (2021), 1241–1253

work page 2021

[5] [5]

Gordon, and Bohan Zhang

Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, and Bohan Zhang. 2017. Automatic Database Management System Tuning Through Large-scale Machine Learning. InSIGMOD. ACM, 1009–1024

work page 2017

[6] [6]

Peter Akioyamen, Zixuan Yi, and Ryan Marcus. 2024. The Unreasonable Effec- tiveness of LLMs for Query Optimization.CoRRabs/2411.02862 (2024)

work page arXiv 2024

[7] [7]

Matteo Brucato, Tarique Siddiqui, Wentao Wu, Vivek Narasayya, and Surajit Chaudhuri. 2024. Wred: Workload Reduction for Scalable Index Tuning.Proc. ACM Manag. Data2, 1, Article 50 (2024), 26 pages

work page 2024

[8] [8]

Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. InSIGMOD. 227–238

work page 2005

[9] [9]

Stefano Cereda et al. 2021. CGPTuner: a Contextual Gaussian Process Bandit Ap- proach for the Automatic Tuning of IT Configurations Under Varying Workload Conditions.Proc. VLDB Endow.14, 8 (2021), 1401–1413

work page 2021

[10] [10]

Sunil Chakkappen et al. 2025. Automatic Indexing in Oracle.Proc. VLDB Endow. 18, 12 (2025), 4924–4937

work page 2025

[11] [11]

Narasayya

Surajit Chaudhuri, Ashish Kumar Gupta, and Vivek R. Narasayya. 2002. Com- pressing SQL workloads. InSIGMOD. 488–499

work page 2002

[12] [12]

Surajit Chaudhuri and Vivek Narasayya. 2020. Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server

work page 2020

[13] [13]

Narasayya

Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. InVLDB. 146–155

work page 1997

[14] [14]

Narasayya

Surajit Chaudhuri and Vivek R. Narasayya. 1998. AutoAdmin ’What-if’ Index Analysis Utility. InSIGMOD. 367–378

work page 1998

[15] [15]

Narasayya

Surajit Chaudhuri and Vivek R. Narasayya. 1999. Index Merging. InICDE

work page 1999

[16] [16]

Sudipto Das et al. 2019. Automatically Indexing Millions of Databases in Mi- crosoft Azure SQL Database. InSIGMOD. 666–679

work page 2019

[17] [17]

Debabrata Dash, Neoklis Polyzotis, and Anastasia Ailamaki. 2011. CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads.Proc. VLDB Endow.4, 6 (2011), 362–372

work page 2011

[18] [18]

Naughton, and Stratis Viglas

Shaleen Deep, Anja Gruenheid, Paraschos Koutris, Jeffrey F. Naughton, and Stratis Viglas. 2020. Comprehensive and Efficient Workload Compression.Proc. VLDB Endow.14, 3 (2020), 418–430

work page 2020

[19] [19]

Sriram Dharwada, Himanshu Devrani, Jayant Haritsa, and Harish Doraiswamy

work page

[20] [20]

Query rewriting via llms.arXiv preprint arXiv:2502.12918(2025)

work page arXiv 2025

[21] [21]

Narasayya

Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. InSIGMOD. 1241–1258

work page 2019

[22] [22]

Narasayya

Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya

work page

[23] [23]

VLDB Endow.11, 10 (2018), 1123–1136

Plan Stitch: Harnessing the Best of Many Plans.Proc. VLDB Endow.11, 10 (2018), 1123–1136

work page 2018

[24] [24]

Yufeng Du, Minyang Tian, Srikanth Ronanki, Subendhu Rongali, Sravan Babu Bodapati, Aram Galstyan, Azton Wells, Roy Schwartz, Eliu A Huerta, and Hao Peng. 2025. Context Length Alone Hurts LLM Performance Despite Perfect Retrieval. InFindings of the Association for Computational Linguistics: EMNLP 2025, Christos Christodoulopoulos, Tanmoy Chakraborty, Carol...

work page doi:10.18653/v1/2025.findings-emnlp.1264 2025

[25] [25]

Songyun Duan et al . 2009. Tuning Database Configuration Parameters with iTuned.Proc. VLDB Endow.2, 1 (2009), 1246–1257

work page 2009

[26] [26]

Victor Giannakouris and Immanuel Trummer. 2024. DBG-PT: A Large Language Model Assisted Query Performance Regression Debugger.Proc. VLDB Endow.17, 12 (2024), 4337–4340. https://www.vldb.org/pvldb/vol17/p4337-giannakouris.pdf

work page 2024

[27] [27]

Victor Giannakouris and Immanuel Trummer. 2025. 𝜆-Tune: Harnessing Large Language Models for Automated Database System Tuning.Proc. ACM Manag. Data3, 1 (2025), 2:1–2:26

work page 2025

[28] [28]

Goetz Graefe. 1995. The Cascades Framework for Query Optimization.IEEE Data Eng. Bull.18, 3 (1995), 19–29

work page 1995

[29] [29]

Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang. 2024. MiniLLM: Knowledge Distillation of Large Language Models. InThe Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=5h0qf7IBZZ

work page 2024

[30] [30]

Beliz Gunel, Jingfei Du, Alexis Conneau, and Veselin Stoyanov. 2021. Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning. In9th Inter- national Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net. https://openreview.net/forum?id=cu7IUiOhujH

work page 2021

[31] [31]

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al . 2025. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning.arXiv preprint arXiv:2501.12948(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[32] [32]

Aaron Hurst, Adam Lerer, Adam P Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, et al. 2024. Gpt-4o system card.arXiv preprint arXiv:2410.21276(2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024

[33] [33]

Ioannidis and Stavros Christodoulakis

Yannis E. Ioannidis and Stavros Christodoulakis. 1991. On the Propagation of Errors in the Size of Join Results. InSIGMOD. 268–277

work page 1991

[34] [34]

Kunal Jha, Wilka Carvalho, Yancheng Liang, Simon Shaolei Du, Max Kleiman- Weiner, and Natasha Jaques. 2025. Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination. InICML

work page 2025

[35] [35]

Andrew Kane. 2017. Introducing Dexter, the Automatic Indexer for Post- gres. https://medium.com/@ankane/introducing-dexter-the-automatic-indexer- for-postgres-5f8fa8b28f27

work page 2017

[36] [36]

Konstantinos Kanellis et al. 2022. LlamaTune: Sample-Efficient DBMS Configu- ration Tuning.Proc. VLDB Endow.15, 11 (2022), 2953–2965

work page 2022

[37] [37]

Jan Kossmann, Stefan Halfpap, Marcel Jankrift, and Rainer Schlosser. 2020. Magic mirror in my hand, which is the best in the land? An Experimental Evaluation of Index Selection Algorithms.Proc. VLDB Endow.13, 11 (2020), 2382–2395

work page 2020

[38] [38]

Jiale Lao et al. 2025. GPTuner: An LLM-Based Database Tuning System.SIGMOD Rec.54, 1 (2025), 101–110

work page 2025

[39] [39]

Viktor Leis et al. 2015. How Good Are Query Optimizers, Really?PVLDB9, 3 (2015), 204–215

work page 2015

[40] [40]

Ao Li, Yuexiang Xie, Songze Li, Fugee Tsung, Bolin Ding, and Yaliang Li. 2025. Agent-Oriented Planning in Multi-Agent Systems. InThe Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net. https://openreview.net/forum?id=EqcLAU6gyU

work page 2025

[41] [41]

Guoliang Li, Xuanhe Zhou, Shifu Li, and Bo Gao. 2019. QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning.Proc. VLDB Endow. 12, 12 (2019), 2118–2130

work page 2019

[42] [42]

Zhaodonghui Li, Haitao Yuan, Jiachen Shi, Hao Zhang, Yu Rong, and Gao Cong

work page

[43] [43]

MAAdvisor: Zero-Shot Index Advisor using Multi-Agent LLMs.CoRR abs/2508.16044 (2025)

work page arXiv 2025

[44] [44]

Zhaodonghui Li, Haitao Yuan, Huiming Wang, Gao Cong, and Lidong Bing

work page

[45] [45]

VLDB Endow.18, 1 (Sept

LLM-R2: A Large Language Model Enhanced Rule-Based Rewrite System for Boosting Query Efficiency.Proc. VLDB Endow.18, 1 (Sept. 2024), 53–65. https://doi.org/10.14778/3696435.3696440

work page doi:10.14778/3696435.3696440 2024

[46] [46]

Wan Shen Lim, Lin Ma, William Zhang, Matthew Butrovich, Samuel Arch, and Andrew Pavlo. 2024. Hit the gym: accelerating query execution to efficiently bootstrap behavior models for self-driving database management systems.Pro- ceedings of the VLDB Endowment17, 11 (2024), 3680–3693

work page 2024

[47] [47]

Jie Liu and Barzan Mozafari. 2024. GenRewrite: Query Rewriting via Large Language Models.arXiv preprint arXiv:2403.09060(2024)

work page arXiv 2024

[48] [48]

Guy Lohman. [n.d.]. Is Query Optimization a “Solved” Problem? http://wp. sigmod.org/?p=1075

work page

[49] [49]

Lin Ma, Bailu Ding, Sudipto Das, and Adith Swaminathan. 2020. Active Learning for ML Enhanced Database Systems. InProceedings of the 2020 International Conference on Management of Data, SIGMOD Conference 2020, online conference [Portland, OR, USA], June 14-19, 2020, David Maier, Rachel Pottinger, AnHai Doan, Wang-Chiew Tan, Abdussalam Alawini, and Hung Q....

work page doi:10.1145/3318464.3389768 2020

[50] [50]

Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, and Nan Duan. 2023. Query Rewriting in Retrieval-Augmented Large Language Models. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association for Computational Linguistics, Singapore, 5303–5315. https://doi.org/10.1865...

work page doi:10.18653/v1/2023.emnlp- 2023

[51] [51]

Amir M. Mansourian, Rozhan Ahmadi, Masoud Ghafouri, Amir Mohammad Babaei, Elaheh Badali Golezani, Zeynab yasamani ghamchi, Vida Ramezanian, Alireza Taherian, Kimia Dinashi, Amirali Miri, and Shohreh Kasaei. 2025. A Comprehensive Survey on Knowledge Distillation.Transactions on Machine Learning Research(2025). https://openreview.net/forum?id=3cbJzdR78B

work page 2025

[52] [52]

Ryan Marcus and Olga Papaemmanouil. 2019. Plan-Structured Deep Neural Network Models for Query Performance Prediction.Proc. VLDB Endow.12, 11 (2019), 1733–1746. https://doi.org/10.14778/3342263.3342646

work page doi:10.14778/3342263.3342646 2019

[53] [53]

Microsoft. 2026. Azure Virtual Machines. https://azure.microsoft.com/en-us/ products/virtual-machines

work page 2026

[54] [54]

Narasayya and Surajit Chaudhuri

Vivek R. Narasayya and Surajit Chaudhuri. 2026. Leveraging Query Opti- mizers to Verify the Soundness of LLM-based Query Rewrites for Real-World Workloads, and More. In16th Conference on Innovative Data Systems Research, CIDR 2026, Chaminade, CA, USA, January 18-21, 2026. www.cidrdb.org. https://vldb.org/cidrdb/2026/leveraging-query-optimizers-to-verify-t...

work page 2026

[55] [55]

OpenAI. 2025. Gpt-5 system card. https://cdn.openai.com/gpt-5-system-card. pdf

work page 2025

[56] [56]

Manning, Ste- fano Ermon, and Chelsea Finn

Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Ste- fano Ermon, and Chelsea Finn. 2023. Direct Preference Optimization: Your 13 Language Model is Secretly a Reward Model. InAdvances in Neural Infor- mation Processing Systems 36: Annual Conference on Neural Information Pro- cessing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, D...

work page 2023

[57] [57]

Rainer Schlosser, Jan Kossmann, and Martin Boissier. 2019. Efficient Scalable Multi-attribute Index Selection Using Recursive Strategies. InICDE. 1238–1249

work page 2019

[58] [58]

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov

work page

[59] [59]

Proximal Policy Optimization Algorithms

Proximal Policy Optimization Algorithms.CoRRabs/1707.06347 (2017). arXiv:1707.06347 http://arxiv.org/abs/1707.06347

work page internal anchor Pith review Pith/arXiv arXiv 2017

[60] [60]

Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. 2024. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models.CoRRabs/2402.03300 (2024). https://doi.org/10.48550/ARXIV.2402.03300 arXiv:2402.03300

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2402.03300 2024

[61] [61]

Chi, Nathanael Schärli, and Denny Zhou

Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed H. Chi, Nathanael Schärli, and Denny Zhou. 2023. Large Language Models Can Be Easily Distracted by Irrelevant Context. InProceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research), Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barba...

work page 2023

[62] [62]

Tarique Siddiqui and Wentao Wu. 2023. ML-Powered Index Tuning: An Overview of Recent Progress and Open Challenges.SIGMOD Rec.52, 4 (2023), 19–30

work page 2023

[63] [63]

Narasayya, and Surajit Chaudhuri

Tarique Siddiqui, Wentao Wu, Vivek R. Narasayya, and Surajit Chaudhuri. 2022. DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing In- dexes for Scalable Index Tuning.Proc. VLDB Endow.15, 10 (2022), 2019–2031

work page 2022

[64] [64]

Zhaoyan Sun, Xuanhe Zhou, Guoliang Li, Xiang Yu, Jianhua Feng, and Yong Zhang. 2025. R-Bot: An LLM-based Query Rewrite System.Proc. VLDB Endow. 18, 12 (2025), 5031–5044

work page 2025

[65] [65]

Jie Tan, Kangfei Zhao, Rui Li, Jeffrey Xu Yu, Chengzhi Piao, Hong Cheng, Helen Meng, Deli Zhao, and Yu Rong. 2025. Can Large Language Models Be Query Optimizer for Relational Databases?Proc. ACM Manag. Data3, 6 (2025), 1–28. https://doi.org/10.1145/3769771

work page doi:10.1145/3769771 2025

[66] [66]

Immanuel Trummer. 2024. DB-BERT: making database tuning tools "read" the manual.VLDB J.33, 4 (2024), 1085–1104

work page 2024

[67] [67]

Gary Valentin et al. 2000. DB2 Advisor: An Optimizer Smart Enough to Recom- mend Its Own Indexes. InICDE. 101–110

work page 2000

[68] [68]

Junxiong Wang et al. 2021. UDO: Universal Database Optimization using Rein- forcement Learning.Proc. VLDB Endow.14, 13 (2021), 3402–3414

work page 2021

[69] [69]

Xiaoying Wang, Changbo Qu, Weiyuan Wu, Jiannan Wang, and Qingqing Zhou

work page

[70] [70]

VLDB Endow.14, 9 (2021), 1640–1654

Are We Ready For Learned Cardinality Estimation?Proc. VLDB Endow.14, 9 (2021), 1640–1654

work page 2021

[71] [71]

Narasayya, and Surajit Chaudhuri

Xiaoying Wang, Wentao Wu, Vivek R. Narasayya, and Surajit Chaudhuri. 2025. Esc: An Early-Stopping Checker for Budget-aware Index Tuning.Proc. VLDB Endow.18, 5 (2025), 1278–1290

work page 2025

[72] [72]

Narasayya, and Surajit Chaud- huri

Xiaoying Wang, Wentao Wu, Chi Wang, Vivek R. Narasayya, and Surajit Chaud- huri. 2024. Wii: Dynamic Budget Reallocation In Index Tuning.Proc. ACM Manag. Data2, 3 (2024), 182

work page 2024

[73] [73]

Kyu-Young Whang. 1985. Index Selection in Relational Databases. InFoundations of Data Organization. 487–500

work page 1985

[74] [74]

Wentao Wu. 2025. Hybrid Cost Modeling for Reducing Query Performance Regression in Index Tuning.IEEE Trans. Knowl. Data Eng.37, 1 (2025), 379–391

work page 2025

[75] [75]

Naughton

Wentao Wu, Yun Chi, Shenghuo Zhu, Jun’ichi Tatemura, Hakan Hacigümüs, and Jeffrey F. Naughton. 2013. Predicting query execution time: Are optimizer cost models really unusable?. InICDE. 1081–1092

work page 2013

[76] [76]

Narasayya, and Surajit Chaudhuri

Wentao Wu, Anshuman Dutt, Gaoxiang Xu, Vivek R. Narasayya, and Surajit Chaudhuri. 2025. Understanding and Detecting Query Performance Regression in Practical Index Tuning: [Experiments & Analysis].Proc. ACM Manag. Data3, 6 (2025), 1–26

work page 2025

[77] [77]

Naughton, and Harneet Singh

Wentao Wu, Jeffrey F. Naughton, and Harneet Singh. 2016. Sampling-Based Query Re-Optimization. InSIGMOD. 1721–1736

work page 2016

[78] [78]

Narasayya, Surajit Chaudhuri, and Philip A

Wentao Wu, Chi Wang, Tarique Siddiqui, Junxiong Wang, Vivek R. Narasayya, Surajit Chaudhuri, and Philip A. Bernstein. 2022. Budget-aware Index Tuning with Reinforcement Learning. InSIGMOD. 1528–1541

work page 2022

[79] [79]

Valluri, and Mohamed Zaït

Ritwik Yadav, Satyanarayana R. Valluri, and Mohamed Zaït. 2023. AIM: A practical approach to automated index management for SQL databases. InICDE. 3349–3362

work page 2023

[80] [80]

An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. 2025. Qwen3 technical report.arXiv preprint arXiv:2505.09388(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025