Evaluating the Practical Effectiveness of LLM-Driven Index Tuning with Microsoft Database Tuning Advisor
Pith reviewed 2026-05-15 13:54 UTC · model grok-4.3
The pith
LLM-driven index tuning can find configurations that significantly outperform DTA in execution time for a considerable number of cases, though DTA is generally more reliable.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Although DTA is generally more reliable, with a few invocations, LLM can identify configurations that significantly outperform those found by DTA in execution time in a considerable number of cases, highlighting its potential as a complementary technique. We also observe that LLM's reasoning captures human-intuitive insights that may be distilled to potentially improve DTA. However, adopting LLM-driven index tuning in production remains challenging due to its substantial performance variance, limited and often negative impact when directly integrated into DTA, and the high cost of performance validation.
What carries the argument
Comparison of index configurations from DTA's what-if API cost estimates versus LLM suggestions, validated through actual query execution times on benchmarks and customer workloads.
Load-bearing premise
The tested benchmarks and real-world customer workloads are representative, and measured execution time improvements accurately reflect production benefits without unaccounted confounding factors like hardware variation.
What would settle it
Repeating the experiments on a fresh collection of customer workloads and finding that LLM fails to produce outperforming configurations in a comparable fraction of cases would falsify the claim.
Original abstract
Index tuning is critical for the performance of modern database systems. Industrial index tuners, such as the Database Tuning Advisor (DTA) developed for Microsoft SQL Server, rely on the "what-if" API provided by the query optimizer to estimate the cost of a query given an index configuration, which can lead to suboptimal recommendations when the estimations are inaccurate. Large language model (LLM) offers a new approach to index tuning, with knowledge learned from web-scale training datasets. However, the effectiveness of LLM-driven index tuning, especially beyond what is already achieved by commercial index tuners, remains unclear. In this paper, we study the practical effectiveness of LLM-driven index tuning using both industrial benchmarks and real-world enterprise customer workloads, and compare it with DTA. Our results show that although DTA is generally more reliable, with a few invocations, LLM can identify configurations that significantly outperform those found by DTA in execution time in a considerable number of cases, highlighting its potential as a complementary technique. We also observe that LLM's reasoning captures human-intuitive insights that may be distilled to potentially improve DTA. However, adopting LLM-driven index tuning in production remains challenging due to its substantial performance variance, limited and often negative impact when directly integrated into DTA, and the high cost of performance validation. This work provides motivation, lessons, and practical insights that will inspire future work on LLM-driven index tuning both in academia and industry.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper evaluates the practical effectiveness of LLM-driven index tuning against Microsoft's Database Tuning Advisor (DTA), using industrial benchmarks and real-world enterprise customer workloads. It claims that, although DTA is generally more reliable, an LLM can, with a few invocations, identify index configurations that significantly outperform DTA's in execution time in a considerable number of cases. It also claims that LLM reasoning captures human-intuitive insights that could improve DTA, while production adoption is held back by performance variance, limited and often negative impact when integrated directly into DTA, and high validation costs.
Significance. If the results hold under rigorous controls, the work is significant for providing empirical evidence on LLM-based index tuning as a potential complement to commercial tools like DTA. The inclusion of real-world customer workloads is a notable strength, offering practical lessons on variance and adoption barriers that could guide hybrid tuning systems in database research.
major comments (2)
- [§4] §4 (Experimental Evaluation): The comparative results on execution-time outperformance lack any description of measurement protocols, including whether runs were repeated, cold-cache conditions were enforced, statistical significance was tested, error bars reported, or controls applied for buffer-pool state, concurrent load, and hardware variation. This directly undermines the central claim that LLM configurations significantly outperform DTA, as unaccounted noise could explain the reported wins.
- [§5] §5 (Discussion of Insights): The assertion that LLM reasoning captures human-intuitive insights lacks concrete examples from the workloads or a quantified analysis of how these could be distilled to improve DTA, rendering the complementarity argument anecdotal rather than evidence-based.
minor comments (1)
- [Abstract] Abstract: The term 'considerable number of cases' is imprecise; stating the exact fraction or count of workloads where LLM outperforms DTA would strengthen clarity without altering the narrative.
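The count requested in the minor comment could be computed along the following lines. This is a minimal sketch, not the paper's method: the `WorkloadResult` record, the `llm_win_fraction` helper, the 10% speedup threshold, and all timings are hypothetical and chosen only for illustration. It assumes each workload has one measured execution time under DTA's configuration and one per LLM invocation, and that the best LLM configuration is the one compared against DTA.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkloadResult:
    name: str
    dta_seconds: float        # execution time under DTA's configuration
    llm_seconds: List[float]  # one execution time per LLM invocation


def llm_win_fraction(results, min_speedup=0.10):
    """Return (wins, total, fraction).

    A "win" means the best LLM configuration is at least `min_speedup`
    (relative improvement) faster than DTA's. The threshold is an assumption.
    """
    wins = 0
    for r in results:
        best_llm = min(r.llm_seconds)
        improvement = (r.dta_seconds - best_llm) / r.dta_seconds
        if improvement >= min_speedup:
            wins += 1
    total = len(results)
    return wins, total, (wins / total if total else 0.0)


# Hypothetical numbers for illustration only; not taken from the paper.
results = [
    WorkloadResult("W1", 100.0, [120.0, 80.0, 105.0]),
    WorkloadResult("W2", 50.0, [55.0, 60.0, 52.0]),
    WorkloadResult("W3", 200.0, [150.0, 210.0, 190.0]),
    WorkloadResult("W4", 30.0, [29.0, 31.0, 28.5]),
]
wins, total, fraction = llm_win_fraction(results)
print(f"LLM outperforms DTA on {wins}/{total} workloads ({fraction:.0%})")
```

Reporting both the raw count and the threshold used makes "considerable number of cases" checkable, and changing `min_speedup` shows how sensitive the headline fraction is to the definition of a win.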
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive review. The comments highlight important areas for improving the clarity and rigor of our experimental and discussion sections. We address each major comment below and will revise the manuscript to incorporate the suggested enhancements.
Point-by-point responses
- Referee: [§4] §4 (Experimental Evaluation): The comparative results on execution-time outperformance lack any description of measurement protocols, including whether runs were repeated, cold-cache conditions were enforced, statistical significance was tested, error bars reported, or controls applied for buffer-pool state, concurrent load, and hardware variation. This directly undermines the central claim that LLM configurations significantly outperform DTA, as unaccounted noise could explain the reported wins.
Authors: We agree that the original manuscript did not provide sufficient detail on the measurement protocols, which is a valid concern for validating the execution-time comparisons. In the revised version, we will expand the experimental setup subsection in §4 to state explicitly that (1) each configuration was executed five times, with the buffer pool flushed between runs to enforce cold-cache conditions; (2) statistical significance was evaluated using paired t-tests (p < 0.05); (3) error bars in figures represent one standard deviation; and (4) controls included dedicated hardware with no concurrent workloads and fixed server configurations to minimize variation. These additions will strengthen the reliability of the outperformance claims without altering the reported results.
revision: yes
- Referee: [§5] §5 (Discussion of Insights): The assertion that LLM reasoning captures human-intuitive insights lacks concrete examples from the workloads or a quantified analysis of how these could be distilled to improve DTA, rendering the complementarity argument anecdotal rather than evidence-based.
Authors: We acknowledge that the discussion in §5 would benefit from greater specificity. In the revision, we will include concrete examples drawn from the enterprise customer workloads, such as cases where the LLM recommended covering indexes for multi-column join patterns that aligned with common DBA practices but were not selected by DTA's cost model. However, a full quantified analysis of distilling these insights into modifications for DTA would require new experiments and implementation work that extends beyond the current study's scope; we will explicitly note this limitation and position it as an avenue for future research to make the complementarity argument more rigorous.
revision: partial
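The measurement protocol described in the response to the §4 comment (five cold-cache runs per configuration, a paired t-test, and error bars of one standard deviation) can be illustrated with a short sketch. It is an assumption-laden example rather than the authors' analysis: the timings are invented, runs are paired by run index (the rebuttal does not say how pairs are formed), and the `paired_t_statistic` helper is hypothetical. Instead of computing a p-value, the sketch compares the statistic with 2.776, the standard two-sided critical value of Student's t at alpha = 0.05 with 4 degrees of freedom.

```python
import math
import statistics

# Two-sided critical value of Student's t for alpha = 0.05 and
# 4 degrees of freedom (five paired runs).
T_CRIT_DF4 = 2.776


def paired_t_statistic(a, b):
    """Paired t statistic for two equal-length samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 runs")
    diffs = [x - y for x, y in zip(a, b)]
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(diffs) / (sd_diff / math.sqrt(len(diffs)))


# Hypothetical cold-cache execution times in seconds; not from the paper.
dta_runs = [10.2, 10.5, 9.9, 10.4, 10.0]
llm_runs = [8.1, 8.6, 8.0, 8.3, 8.2]

t = paired_t_statistic(dta_runs, llm_runs)
significant = abs(t) > T_CRIT_DF4

# Mean and one standard deviation, as would be drawn for the error bars.
for label, runs in (("DTA", dta_runs), ("LLM", llm_runs)):
    print(f"{label}: {statistics.mean(runs):.2f} s "
          f"(sd {statistics.stdev(runs):.2f} s)")
print(f"paired t = {t:.2f}, significant at 0.05: {significant}")
```

With only five runs per configuration the test has little power against small differences, so a non-significant result here would not show that two configurations perform equally.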
Circularity Check
No circularity: empirical evaluation relies on external benchmarks and workloads
Full rationale
This paper is a pure empirical study comparing LLM index tuning against DTA on industrial benchmarks and real customer workloads. No mathematical derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation chain. All claims rest on direct experimental measurements against external data sources, with no reduction of outputs to inputs by construction.