CA-SQL achieves 51.72% execution accuracy on the challenging tier of the BIRD benchmark using GPT-4o-mini by combining three techniques: exploration breadth scaled to estimated task difficulty, evolutionary prompt seeding, and candidate voting.
2 Pith papers cite this work.
Years: 2026 (2)
Representative citing papers
A paired benchmark demonstrates that providing an explicit semantic layer document improves LLM accuracy on text-to-SQL tasks by 17-23 percentage points and eliminates meaningful differences between frontier models.
Citing papers explorer
- CA-SQL: Complexity-Aware Inference Time Reasoning for Text-to-SQL via Exploration and Compute Budget Allocation
CA-SQL achieves 51.72% execution accuracy on the challenging tier of the BIRD benchmark using GPT-4o-mini by combining three techniques: exploration breadth scaled to estimated task difficulty, evolutionary prompt seeding, and candidate voting.
- Semantic Layers for Reliable LLM-Powered Data Analytics: A Paired Benchmark of Accuracy and Hallucination Across Three Frontier Models
A paired benchmark demonstrates that providing an explicit semantic layer document improves LLM accuracy on text-to-SQL tasks by 17-23 percentage points and eliminates meaningful differences between frontier models.