pith. machine review for the scientific record.

arxiv: 2504.15077 · v5 · submitted 2025-04-21 · 💻 cs.LG · cs.DB

Recognition: unknown

Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL

classification 💻 cs.LG cs.DB
keywords models, reasoning, text2sql, language, learning, llms, reward, small

Large Language Models (LLMs) can translate natural language into SQL, but small models struggle with multi-table and complex queries in Zero-Shot Learning (ZSL) settings. While Supervised Fine-Tuning (SFT) helps, it falls short for harder cases. To address this, we study how different reasoning strategies (general-purpose reasoning in ZSL, reasoning traces in SFT, and Reinforcement Learning with Verifiable Reward (RLVR) with novel reward functions) affect Text2SQL performance across four benchmarks. We show that partial scoring rewards, computed via SQL execution, are crucial for guiding models even when outputs are not fully correct. These fine-grained signals lead to consistently better Text2SQL outcomes. Small LLMs benefit most from reasoning-aware SFT and RL, with the 14B Qwen-Coder-2.5 surpassing 400B+ models on challenging datasets like BIRD.
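
To make the idea of a partial scoring reward computed via SQL execution concrete, here is a minimal sketch. It is not the paper's reward function, which the abstract does not specify. It assumes a SQLite database and scores a predicted query by the multiset Jaccard overlap between its result rows and those of the gold query. The names `run_query`, `partial_execution_reward`, and `demo_conn`, and the toy `emp` table, are hypothetical and introduced only for illustration.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows of `sql` as a multiset, or None if execution fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def partial_execution_reward(conn, predicted_sql, gold_sql):
    """Hypothetical partial-credit reward in [0, 1] (an assumption, not the paper's).

    1.0 means identical result multisets, 0.0 means no shared rows or a
    predicted query that fails to execute; values in between give partial credit.
    """
    gold = run_query(conn, gold_sql)
    if gold is None:
        raise ValueError("gold query failed to execute")
    pred = run_query(conn, predicted_sql)
    if pred is None:
        return 0.0  # a non-executable prediction earns no reward
    union = sum((pred | gold).values())  # multiset union: max of counts
    if union == 0:
        return 1.0  # both queries return no rows, so the results match
    overlap = sum((pred & gold).values())  # multiset intersection: min of counts
    return overlap / union


def demo_conn():
    """Build a tiny in-memory database (toy data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE emp(name TEXT, dept TEXT);"
        "INSERT INTO emp VALUES ('ann','eng'),('bob','eng'),('cy','ops');"
    )
    return conn
```

With the toy table, a prediction that omits the `WHERE` clause of the gold query `SELECT name FROM emp WHERE dept = 'eng'` returns three rows, two of which are correct, so it scores 2/3 instead of the 0 that an exact-match reward would assign. This is the kind of fine-grained signal the abstract describes for outputs that are not fully correct.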

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SecureMCP: A Policy-Enforced LLM Data Access Framework for AIoT Systems via Model Context Protocol

    cs.CR 2026-05 unverdicted novelty 5.0

    SecureMCP integrates RBAC with five sequential defense modules in an MCP server to achieve 82.3% policy compliance against adversarial LLM SQL queries in AIoT while preserving execution accuracy.