FACTS: Table Summarization via Offline Template Generation with Agentic Workflows

Mohammad Amin Shabani; Siqi Liu; Ye Yuan

arxiv: 2510.13920 · v2 · submitted 2025-10-15 · 💻 cs.CL

Pith reviewed 2026-05-18 07:19 UTC · model grok-4.3

classification 💻 cs.CL

keywords table summarization · query-focused summarization · agentic workflows · SQL generation · Jinja2 templates · privacy-preserving AI · offline template generation · LLM agents

The pith

FACTS generates reusable SQL queries and Jinja2 templates offline via an agentic workflow to produce fast, accurate, and privacy-compliant query-focused table summaries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents FACTS as a method that uses an agentic workflow to create offline templates consisting of SQL queries and Jinja2 templates from table schemas. These templates can be reused across any tables sharing the same schema to render natural language summaries conditioned on a user query. This design avoids repeated large language model calls during inference, keeps sensitive table data private by sending only schemas to the model, and relies on executable SQL for accuracy. Evaluations on standard benchmarks indicate consistent gains over prior table-to-text, prompt-based, and agentic approaches.

Core claim

FACTS produces offline templates, consisting of SQL queries and Jinja2 templates, which can be rendered into natural language summaries and are reusable across multiple tables sharing the same schema. It enables fast summarization through reusable offline templates, accurate outputs with executable SQL queries, and privacy compliance by sending only table schemas to LLMs.
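A minimal sketch of the reuse idea, under loud assumptions: the `TEMPLATE` dictionary, the `summarize` helper, the table name `t`, and the sample rows are all hypothetical, and Python's built-in `sqlite3` and `str.format` stand in for the SQL engine and for Jinja2. It is not the authors' implementation. It only shows one stored query-plus-text pair being applied, unchanged, to two tables with the same schema.

```python
import sqlite3

# Hypothetical offline template: one SQL query plus one text template.
# sqlite3 and str.format are stand-ins for the SQL engine and Jinja2.
TEMPLATE = {
    "sql": 'SELECT "team", SUM("points") AS total FROM t '
           'GROUP BY "team" ORDER BY total DESC',
    "text": "{team} leads with {total} points.",
}

def summarize(rows, template=TEMPLATE):
    """Load rows into an in-memory table, run the stored SQL, fill the text."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE t ("team" TEXT, "points" INTEGER)')
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    top = conn.execute(template["sql"]).fetchone()
    conn.close()
    if top is None:  # empty table: fall back to a fixed sentence
        return "No data available."
    return template["text"].format(team=top[0], total=top[1])

# Two tables with the same schema reuse the same template; no model call here.
season_a = [("Red", 10), ("Blue", 7), ("Red", 5)]
season_b = [("Blue", 20), ("Red", 3)]
summary_a = summarize(season_a)
summary_b = summarize(season_b)
```

Nothing in the rendering step depends on a language model, which is where the claimed speed and privacy properties would come from if template generation succeeds.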

What carries the argument

Agentic workflow that generates executable SQL queries paired with Jinja2 templates from table schemas and queries alone, allowing offline rendering of summaries.
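One way to picture such a workflow is a propose-validate-retry loop. The sketch below is an assumption-laden illustration, not the paper's algorithm: `validate_sql`, `generate_with_retries`, and `fake_propose` are hypothetical names, the two checks (the query executes, and it returns rows) are assumed, `sqlite3` stands in for the SQL engine, and a scripted stub replaces the LLM call.

```python
import sqlite3

def validate_sql(sql, conn):
    """Return (ok, feedback) from two assumed checks: runs, and returns rows."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return False, f"execution error: {exc}"
    if not rows:
        return False, "query returned no rows"
    return True, "SQL query is good"

def generate_with_retries(propose, conn, max_attempts=3):
    """Call `propose(feedback)` until a candidate passes validation."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        sql = propose(feedback)
        ok, feedback = validate_sql(sql, conn)
        if ok:
            return sql, attempt
    return None, max_attempts

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, score INTEGER)")
conn.execute("INSERT INTO t VALUES ('a', 1)")  # local row, never sent to a model

# Stub standing in for an LLM: the first answer has a typo, the second is fixed.
candidates = iter(["SELECT nam FROM t", "SELECT name FROM t"])

def fake_propose(feedback):
    return next(candidates)

sql, attempts = generate_with_retries(fake_propose, conn)
```

In a real system the feedback string would be passed back to the model; here it is only threaded through to show where it would go.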

If this is right

Summaries can be produced at scale without per-query LLM calls once templates exist for a schema.
Only table schemas reach the language model, so raw data never leaves the local environment.
Because values come from executed SQL rather than free-form generation, numeric and relational facts in the summary match the source table whenever the query itself is correct.
New tables with matching schemas immediately support the same query templates without retraining.
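The privacy point can be illustrated with a small sketch. Everything named here is an assumption for illustration: the `schema_only_prompt` helper, the `patients` table, and the prompt wording are hypothetical, and `sqlite3` stands in for whatever engine holds the data. The helper reads column names and types only, so no cell value can appear in the text that would be sent to a model.

```python
import sqlite3

def schema_only_prompt(conn, table, user_query):
    """Build a prompt from column names and types; row values are never read.

    `table` is assumed to be a trusted identifier, not user input.
    """
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, default, pk); keep name and type.
    schema = ", ".join(f"{name} ({ctype})" for _, name, ctype, *_ in cols)
    return f"Table {table} has columns: {schema}.\nUser query: {user_query}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, diagnosis TEXT, age INTEGER)")
conn.execute("INSERT INTO patients VALUES ('Alice Example', 'flu', 41)")

prompt = schema_only_prompt(conn, "patients", "Summarize diagnoses by age group.")
```

Column names can themselves be sensitive in some settings, so a schema-only payload reduces exposure rather than eliminating it.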

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same offline-template pattern could apply to other structured data formats such as JSON documents or graph databases if analogous query languages are substituted.
In high-volume production settings the approach would shift cost from inference time to one-time template creation.
Template libraries could be versioned and shared across organizations that use identical database schemas.
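The cost-shift extension can be made concrete with a back-of-the-envelope model. The function names and the call counts below are hypothetical, chosen only to show the arithmetic; the paper does not report these figures. The model counts LLM calls and ignores the local cost of running SQL and rendering text.

```python
def online_calls(n_tables, calls_per_summary):
    """LLM calls when every table is summarized by prompting a model."""
    return n_tables * calls_per_summary

def break_even_tables(generation_calls, calls_per_summary):
    """Smallest number of tables at which one-time template generation
    costs no more LLM calls than per-table prompting (ceiling division)."""
    return -(-generation_calls // calls_per_summary)

# Hypothetical figures: 12 calls to build a template, 2 calls per online summary.
n_break_even = break_even_tables(12, 2)
calls_for_100_tables = online_calls(100, 2)
```

Under these made-up numbers the template pays for itself at six tables, and every further table with the same schema and query adds no model calls.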

Load-bearing premise

An agentic workflow can reliably produce correct, reusable SQL queries and Jinja2 templates directly from table schemas without manual fixes or breakdowns on complex reasoning queries.

What would settle it

A benchmark query on tables with the same schema where the generated SQL produces incorrect aggregates or joins, or where the rendered Jinja2 output is factually wrong or unreadable.
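A falsification check of this kind could be sketched as follows. The query, the helper names, and the sample tables are hypothetical, and `sqlite3` stands in for the SQL engine. The idea is to recompute the same aggregate independently in plain Python and compare it with what the stored SQL returns on several tables that share a schema.

```python
import sqlite3

# Hypothetical stored query under test.
SQL = 'SELECT "region", AVG("sales") FROM t GROUP BY "region" ORDER BY "region"'

def sql_answer(rows):
    """Run the stored query on a fresh in-memory copy of the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE t ("region" TEXT, "sales" REAL)')
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = conn.execute(SQL).fetchall()
    conn.close()
    return result

def reference_answer(rows):
    """Recompute the per-region mean without SQL."""
    groups = {}
    for region, sales in rows:
        groups.setdefault(region, []).append(sales)
    return [(r, sum(v) / len(v)) for r, v in sorted(groups.items())]

def template_survives(tables, tol=1e-9):
    """True if SQL and reference agree on every table."""
    for rows in tables:
        got, want = sql_answer(rows), reference_answer(rows)
        if len(got) != len(want):
            return False
        for (r1, a1), (r2, a2) in zip(got, want):
            if r1 != r2 or abs(a1 - a2) > tol:
                return False
    return True

tables = [
    [("north", 10.0), ("south", 4.0), ("north", 20.0)],
    [("east", 1.5), ("east", 2.5)],
]
ok = template_survives(tables)
```

A single table on which the two answers diverge would be the counterexample this section describes; agreement on a handful of tables is weaker evidence in the other direction.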

Figures

Figures reproduced from arXiv: 2510.13920 by Mohammad Amin Shabani, Siqi Liu, Ye Yuan.

**Figure 2.** The FACTS framework for query-focused table summarization via Offline Template

**Figure 3.** Reusability analysis. Runtime for generating summaries with 1 versus 100 tables under the same schema and query. (Plot: x-axis, number of rows in tables; y-axis, seconds for generating summaries; series: Reason-then-Summ, SPaGe, FACTS.)

Original abstract

Query-focused table summarization requires generating natural language summaries of tabular data conditioned on a user query, enabling users to access insights beyond fact retrieval. Existing approaches face key limitations: table-to-text models require costly fine-tuning and struggle with complex reasoning, prompt-based LLM methods suffer from token-limit and efficiency issues while exposing sensitive data, and prior agentic pipelines often rely on decomposition, planning, or manual templates that lack robustness and scalability. To mitigate these issues, we introduce an agentic workflow, FACTS, a Fast, Accurate, and Privacy-Compliant Table Summarization approach via Offline Template Generation. FACTS produces offline templates, consisting of SQL queries and Jinja2 templates, which can be rendered into natural language summaries and are reusable across multiple tables sharing the same schema. It enables fast summarization through reusable offline templates, accurate outputs with executable SQL queries, and privacy compliance by sending only table schemas to LLMs. Evaluations on widely-used benchmarks show that FACTS consistently outperforms baseline methods, establishing it as a practical solution for real-world query-focused table summarization. Our code is available at https://github.com/BorealisAI/FACTS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

FACTS is a practical engineering tweak that generates reusable SQL plus Jinja2 templates offline via agents for query-focused table summarization, but the robustness and evaluation details need closer checking.

The main point is that FACTS builds an agentic workflow to produce offline templates from table schemas, where SQL handles the data extraction and Jinja2 turns it into natural language summaries. Once made, those templates can be reused across any table with the same schema without calling the LLM again or sending actual data rows. This targets the usual bottlenecks: no fine-tuning, no token bloat at runtime, and better privacy since only schemas reach the model. The code release on GitHub is useful if someone wants to test it directly.

What stands out as new is the specific offline SQL-plus-Jinja2 combination inside the agent loop, which differs from the decomposition or manual-template baselines mentioned. The paper does a reasonable job laying out why prior agentic pipelines fell short on scalability. The central claim of consistent benchmark gains is stated clearly, though the abstract leaves the actual numbers and setup details for the full text.

On the soft side, the stress-test worry about template correctness on complex queries holds some weight. If the agent struggles with joins, aggregations, or multi-step reasoning and needs frequent human fixes or retries, the reusability advantage shrinks fast. The paper would benefit from reporting generation success rates and failure modes rather than just final summary quality.

This work is aimed at practitioners who need query-driven table insights in analytics pipelines or internal tools. A reader already working with LLMs on structured data could pick up the method and code without much overhead. It deserves peer review because the idea is concrete, the implementation is public, and the practical framing is clear, even if the experiments need tightening on robustness metrics.

Referee Report

2 major / 2 minor

Summary. The paper introduces FACTS, an agentic workflow for query-focused table summarization that generates reusable offline templates consisting of SQL queries and Jinja2 templates directly from table schemas. These templates are rendered at inference time to produce natural-language summaries, addressing limitations of fine-tuned table-to-text models (costly training, poor complex reasoning), prompt-based LLM methods (token limits, data exposure), and prior agentic pipelines (lack of robustness and scalability). The approach claims to deliver fast inference via reusability, accuracy via executable SQL, and privacy compliance by transmitting only schemas to the LLM. Evaluations on widely-used benchmarks are reported to show consistent outperformance over baselines, with code released at https://github.com/BorealisAI/FACTS.

Significance. If the empirical claims hold, the work offers a practical engineering advance for real-world table summarization systems, particularly where privacy constraints and reusability across schema-identical tables matter. The offline template paradigm could reduce repeated LLM calls and data exposure compared with online prompting approaches. Open-sourcing the code supports reproducibility, which strengthens the contribution for the NLP and data-to-text communities.

major comments (2)

[§4 and §5] §4 (Agentic Workflow) and §5 (Experiments): the central claim that FACTS is a 'practical solution for real-world query-focused table summarization' rests on the agentic pipeline reliably producing correct, executable SQL+Jinja2 templates from schemas alone. No quantitative results are supplied on template-generation success rate, average retry count, failure modes for queries requiring joins/aggregations/multi-table logic, or frequency of manual intervention. Without these metrics the reported benchmark gains cannot be attributed to the claimed robustness advantage over prior agentic methods.
[§5] §5 (Experimental Results): the abstract and results section assert 'consistent outperformance' on widely-used benchmarks, yet the manuscript supplies neither per-dataset scores with error bars, ablation studies isolating the contribution of offline template reuse versus the agentic generation step, nor details on query complexity distribution. This makes it impossible to verify whether the gains generalize beyond the evaluated tables or collapse on reasoning-heavy queries.
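The robustness metrics requested in the first major comment could be computed from a generation log along these lines. The record layout (`success`, `attempts`, `failure`) and the sample entries are hypothetical, invented to show the bookkeeping; they are not results from the paper.

```python
from collections import Counter

# Hypothetical generation log: one record per (schema, query) pair.
runs = [
    {"success": True, "attempts": 1, "failure": None},
    {"success": True, "attempts": 3, "failure": None},
    {"success": False, "attempts": 5, "failure": "join"},
    {"success": True, "attempts": 2, "failure": None},
]

def generation_report(runs):
    """Summarize success rate, mean retries, and failure-mode counts."""
    if not runs:
        raise ValueError("no runs to report on")
    n = len(runs)
    n_success = sum(1 for r in runs if r["success"])
    return {
        "success_rate": n_success / n,
        # A retry is any attempt after the first one.
        "mean_retries": sum(r["attempts"] - 1 for r in runs) / n,
        "failure_modes": Counter(r["failure"] for r in runs if not r["success"]),
    }

report = generation_report(runs)
```

Reporting these alongside end-to-end summary scores would let a reader separate how often templates are produced at all from how good the rendered summaries are.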

minor comments (2)

[Abstract and §1] The abstract and introduction repeatedly use the phrase 'widely-used benchmarks' without naming the specific datasets or providing citations in the first occurrence.
[§3] Notation for the generated artifacts (SQL query, Jinja2 template, rendered summary) is introduced informally; a small table or diagram in §3 would clarify the data flow.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps clarify the presentation of our contributions. We address each major comment below and commit to revisions that strengthen the empirical support for our claims without altering the core approach.

Referee: [§4 and §5] §4 (Agentic Workflow) and §5 (Experiments): the central claim that FACTS is a 'practical solution for real-world query-focused table summarization' rests on the agentic pipeline reliably producing correct, executable SQL+Jinja2 templates from schemas alone. No quantitative results are supplied on template-generation success rate, average retry count, failure modes for queries requiring joins/aggregations/multi-table logic, or frequency of manual intervention. Without these metrics the reported benchmark gains cannot be attributed to the claimed robustness advantage over prior agentic methods.

Authors: We agree that direct quantitative metrics on the template-generation stage would strengthen attribution of the observed gains to the robustness of the agentic workflow. In the revised manuscript we will add a dedicated analysis (new subsection in §4 or appendix) reporting template-generation success rate, average retry counts, breakdown of failure modes (with emphasis on joins, aggregations, and multi-table logic), and frequency of manual intervention across the evaluated benchmarks. These additions will make the robustness advantage over prior agentic methods more transparent while preserving the focus on end-to-end summarization performance.
Revision: yes

Referee: [§5] §5 (Experimental Results): the abstract and results section assert 'consistent outperformance' on widely-used benchmarks, yet the manuscript supplies neither per-dataset scores with error bars, ablation studies isolating the contribution of offline template reuse versus the agentic generation step, nor details on query complexity distribution. This makes it impossible to verify whether the gains generalize beyond the evaluated tables or collapse on reasoning-heavy queries.

Authors: We acknowledge that more granular reporting would improve verifiability. In the revision we will expand §5 to include per-dataset scores with error bars, ablation studies that isolate the contribution of offline template reuse from the agentic generation step, and a characterization of query complexity distribution (e.g., proportion of queries requiring joins or aggregations). These additions will clarify generalization and performance on reasoning-heavy queries while retaining the overall benchmark comparisons.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical engineering method evaluated on external benchmarks

The paper presents FACTS as an agentic workflow that generates reusable SQL queries and Jinja2 templates offline from table schemas, then evaluates the resulting summarization system on widely-used external benchmarks. No equations, fitted parameters, or derivation steps are described that could reduce to the method's own inputs. The central claims rest on benchmark outperformance and practical advantages (speed, accuracy, privacy) rather than any self-referential construction, self-citation chain, or ansatz smuggled via prior work. This is a standard empirical contribution with no load-bearing circular elements.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the unstated premise that large language models can generate correct and generalizable SQL-plus-Jinja2 templates from schema information alone and that these templates will remain accurate across tables sharing the same schema without post-hoc fixes.

axioms (1)

domain assumption: LLMs can produce executable SQL queries and correct Jinja2 templates from table schemas that generalize to new tables with the same schema.
This premise is required for the offline template generation step to deliver both accuracy and reusability as claimed.
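A cheap guard on this premise is to confirm, before reusing a template, that the new table really has the schema the template was built for. The sketch below is hypothetical: `schema_fingerprint`, the `library` dictionary, and the table names are illustrative, and `sqlite3` stands in for the actual data store.

```python
import hashlib
import sqlite3

def schema_fingerprint(conn, table):
    """Hash ordered column names and declared types into a stable key.

    `table` is assumed to be a trusted identifier, not user input.
    """
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    canonical = "|".join(f"{name}:{ctype.upper()}" for _, name, ctype, *_ in cols)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q1 (team TEXT, points INTEGER)")
conn.execute("CREATE TABLE q2 (team TEXT, points INTEGER)")  # same schema
conn.execute("CREATE TABLE q3 (team TEXT, score INTEGER)")   # renamed column

# Template library keyed by fingerprint; the value is a placeholder string.
library = {schema_fingerprint(conn, "q1"): "template built for q1"}
reusable_q2 = schema_fingerprint(conn, "q2") in library
reusable_q3 = schema_fingerprint(conn, "q3") in library
```

A matching fingerprint only shows that column names and types agree. It says nothing about whether values in the new table behave like those the template was validated on, which is the harder part of the assumption.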

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "FACTS produces offline templates, consisting of SQL queries and Jinja2 templates... Evaluations on widely-used benchmarks show that FACTS consistently outperforms baseline methods"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "We propose offline template generation, which produces reusable and schema-specific templates"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
