ACE-SQL jointly optimizes schema linking and SQL generation via RL with empirical credit assignment from execution-correct rollouts, achieving 65.3% greedy execution accuracy on BIRD Dev using 0.93k output tokens.
arXiv preprint arXiv:2503.02240 (2025)
8 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (8)
verdicts: UNVERDICTED (8)
citing papers explorer
- ACE-SQL: Adaptive Co-Optimization via Empirical Credit Assignment for Text-to-SQL
  ACE-SQL jointly optimizes schema linking and SQL generation via RL with empirical credit assignment from execution-correct rollouts, achieving 65.3% greedy execution accuracy on BIRD Dev using 0.93k output tokens.
- EXPO-SQL: Execution-based Clause-level Policy Optimization for Text-to-SQL
  EXPO-SQL improves Text-to-SQL by using clause-level rewards derived from execution error messages and incremental clause execution instead of uniform query-level rewards.
- SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints
  SpotIt+ uses verification to find realistic counterexample databases that expose discrepancies between generated and gold SQL queries missed by standard test-based evaluation on the BIRD dataset.
- EcoTable: Cost-effective Table Integration in Data Lakes for Natural Language Queries
  EcoTable is the first NL-based data integration framework that builds a join-likelihood graph, uses two-stage schema linking and Steiner tree search to find paths, then generates transformations with LLMs, reporting >30% accuracy gain and 5x lower cost on four real-world datasets.
- Data-aware candidate selection in NL2SQL translation via small separating instances
  A selection technique based on separating instances and provenance outperforms baselines for choosing among 2-3 NL2SQL candidates on a BIRD-DEV subset without consistency scores.
- FINER-SQL: Boosting Small Language Models for Text-to-SQL
  FINER-SQL boosts 3B-parameter small language models to 67.73% and 85% execution accuracy on BIRD and Spider benchmarks via dense memory and atomic rewards in group relative policy optimization, matching larger LLMs at lower latency.
- Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning
  APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.
- Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs
  FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.