arxiv: 2510.13920 · v2 · submitted 2025-10-15 · 💻 cs.CL

FACTS: Table Summarization via Offline Template Generation with Agentic Workflows

Pith reviewed 2026-05-18 07:19 UTC · model grok-4.3

classification 💻 cs.CL
keywords table summarization · query-focused summarization · agentic workflows · SQL generation · Jinja2 templates · privacy-preserving AI · offline template generation · LLM agents

The pith

FACTS generates reusable SQL queries and Jinja2 templates offline via an agentic workflow to produce fast, accurate, and privacy-compliant query-focused table summaries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents FACTS as a method that uses an agentic workflow to create offline templates consisting of SQL queries and Jinja2 templates from table schemas. These templates can be reused across any tables sharing the same schema to render natural language summaries conditioned on a user query. This design avoids repeated large language model calls during inference, keeps sensitive table data private by sending only schemas to the model, and relies on executable SQL for accuracy. Evaluations on standard benchmarks indicate consistent gains over prior table-to-text, prompt-based, and agentic approaches.

Core claim

FACTS produces offline templates, consisting of SQL queries and Jinja2 templates, which can be rendered into natural language summaries and are reusable across multiple tables sharing the same schema. It enables fast summarization through reusable offline templates, accurate outputs with executable SQL queries, and privacy compliance by sending only table schemas to LLMs.

What carries the argument

Agentic workflow that generates executable SQL queries paired with Jinja2 templates from table schemas and queries alone, allowing offline rendering of summaries.
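
A minimal sketch of the reuse idea, not the authors' implementation. It assumes a hypothetical `sales` table, and to stay within the Python standard library it substitutes `sqlite3` for the SQL engine and a plain `str.format` pattern for the Jinja2 template. The names `SQL_QUERY`, `ROW_PATTERN`, `make_table`, and `render_summary` are illustrative only.

```python
import sqlite3

# Offline template: produced once per (schema, query) pair.
# Both parts are hypothetical stand-ins for the SQL + Jinja2 pair in the paper.
SQL_QUERY = """
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region
ORDER BY total DESC
"""
ROW_PATTERN = "{region} sold {total} units"
EMPTY_TEXT = "No matching rows were found."


def make_table(rows):
    """Create an in-memory table with the shared schema (region TEXT, amount INTEGER)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def render_summary(conn):
    """Execute the stored SQL and fill the stored pattern; no LLM call is involved."""
    values = [dict(row) for row in conn.execute(SQL_QUERY)]
    if not values:
        return EMPTY_TEXT
    return "; ".join(ROW_PATTERN.format(**row) for row in values) + "."


# The same template serves two different tables that share one schema.
summary_a = render_summary(make_table([("North", 10), ("South", 4), ("North", 5)]))
summary_b = render_summary(make_table([("East", 7)]))
print(summary_a)
print(summary_b)
```

In this sketch the only per-table work is one SQL execution and one string fill, which is the property the offline-template design relies on.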

If this is right

  • Summaries can be produced at scale without per-query LLM calls once templates exist for a schema.
  • Only table schemas reach the language model, so raw data never leaves the local environment.
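  • Because figures are computed by executing SQL against the table rather than generated by the LLM, numeric and relational facts in the summary match the source table whenever the query itself is correct.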
  • New tables with matching schemas immediately support the same query templates without retraining.
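
The privacy point above can be illustrated with a short sketch. It assumes a hypothetical `patients` table held in an in-memory `sqlite3` database; the helper `schema_only_prompt` and the prompt wording are assumptions, not the prompt used in the paper. The point is only that column names and declared types go into the prompt while cell values do not.

```python
import sqlite3


def schema_only_prompt(conn, table, user_query):
    """Build a prompt from column names and declared types only."""
    # `table` is a trusted identifier here; it is not user input.
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    schema = ", ".join(f"{col[1]} {col[2]}" for col in columns)
    return f"Table {table}({schema})\nUser query: {user_query}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, diagnosis TEXT, age INTEGER)")
conn.execute("INSERT INTO patients VALUES ('Alice Example', 'asthma', 41)")

prompt = schema_only_prompt(conn, "patients", "Summarize diagnoses by age group")
print(prompt)
```

Note that column names can themselves be sensitive in some settings, so a schema-only prompt limits exposure rather than eliminating it.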

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same offline-template pattern could apply to other structured data formats such as JSON documents or graph databases if analogous query languages are substituted.
  • In high-volume production settings the approach would shift cost from inference time to one-time template creation.
  • Template libraries could be versioned and shared across organizations that use identical database schemas.
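
The cost-shifting point can be made concrete with a small break-even calculation. This is an editorial sketch: the function `break_even_tables` and the cost figures passed to it are hypothetical and are not measurements from the paper. It models the offline approach as a one-time generation cost plus a small per-table rendering cost, and the online approach as a fixed per-table cost.

```python
import math


def break_even_tables(gen_cost, render_cost, online_cost):
    """Smallest table count n with gen_cost + n * render_cost < n * online_cost.

    Returns None when per-table rendering is not cheaper than the online call,
    in which case the offline approach never pays off under this model.
    """
    if online_cost <= render_cost:
        return None
    return math.floor(gen_cost / (online_cost - render_cost)) + 1


# Hypothetical costs in seconds: 60 to build a template, 0.5 to render, 20 per online call.
n = break_even_tables(gen_cost=60, render_cost=0.5, online_cost=20)
print(n)
```

Under these made-up numbers the template pays for itself from the fourth table onward; real break-even points depend on measured costs.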

Load-bearing premise

An agentic workflow can reliably produce correct, reusable SQL queries and Jinja2 templates directly from table schemas without manual fixes or breakdowns on complex reasoning queries.
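
One mechanical part of this premise can be checked automatically. The evaluation prompts quoted in the paper's appendix ask whether a query executes without errors and returns non-empty data; the sketch below implements just those two checks. It uses `sqlite3` as a stand-in SQL engine, and `check_sql` with its feedback strings is a hypothetical helper, not the authors' code. It says nothing about whether a query that passes is semantically right.

```python
import sqlite3


def check_sql(conn, sql):
    """Return (ok, feedback): the query must execute and return at least one row."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        # Feedback like this could be passed back to the generator for a retry.
        return False, f"execution error: {exc}"
    if not rows:
        return False, "query returned no rows"
    return True, "SQL query is good"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.execute("INSERT INTO sales VALUES ('North', 10)")

ok_good, _ = check_sql(conn, "SELECT region FROM sales")
ok_bad, feedback_bad = check_sql(conn, "SELECT missing FROM sales")
ok_empty, feedback_empty = check_sql(conn, "SELECT region FROM sales WHERE amount > 100")
```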

What would settle it

A benchmark query on tables with the same schema where the generated SQL produces incorrect aggregates or joins, or where the rendered Jinja2 output is factually wrong or unreadable.
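
A falsification test of this kind could be sketched as follows, assuming a hypothetical `sales` table and `sqlite3` as a stand-in SQL engine. The helper `aggregate_matches` recomputes per-region sums in plain Python and compares them with what a candidate query returns; both example queries are invented for illustration.

```python
import sqlite3


def aggregate_matches(rows, sql):
    """Compare a candidate per-region query against a pure-Python recomputation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    from_sql = dict(conn.execute(sql).fetchall())  # {region: value from the query}

    expected = {}
    for region, amount in rows:
        expected[region] = expected.get(region, 0) + amount
    return from_sql == expected


rows = [("North", 10), ("South", 4), ("North", 5)]
good_sql = "SELECT region, SUM(amount) FROM sales GROUP BY region"
bad_sql = "SELECT region, MAX(amount) FROM sales GROUP BY region"  # wrong aggregate

good_result = aggregate_matches(rows, good_sql)
bad_result = aggregate_matches(rows, bad_sql)
```

A single mismatch on a schema-identical table, such as the `MAX` query here, is the kind of counterexample this section describes.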

Figures

Figures reproduced from arXiv: 2510.13920 by Mohammad Amin Shabani, Siqi Liu, Ye Yuan.

Figure 1: Comparison between DirectSumm (Zhang et al., 2024) (left) and our proposed FACTS
Figure 2: The FACTS framework for query-focused table summarization via Offline Template
Figure 3: Reusability analysis. Runtime for generating summaries with 1 versus 100 tables under the same schema and query. [Plot: x-axis "Number of Rows in Tables" (0 to 1000); y-axis "Seconds for Generating Summaries" (20 to 90); series: Reason-then-Summ, SPaGe, FACTS.]

Original abstract

Query-focused table summarization requires generating natural language summaries of tabular data conditioned on a user query, enabling users to access insights beyond fact retrieval. Existing approaches face key limitations: table-to-text models require costly fine-tuning and struggle with complex reasoning, prompt-based LLM methods suffer from token-limit and efficiency issues while exposing sensitive data, and prior agentic pipelines often rely on decomposition, planning, or manual templates that lack robustness and scalability. To mitigate these issues, we introduce an agentic workflow, FACTS, a Fast, Accurate, and Privacy-Compliant Table Summarization approach via Offline Template Generation. FACTS produces offline templates, consisting of SQL queries and Jinja2 templates, which can be rendered into natural language summaries and are reusable across multiple tables sharing the same schema. It enables fast summarization through reusable offline templates, accurate outputs with executable SQL queries, and privacy compliance by sending only table schemas to LLMs. Evaluations on widely-used benchmarks show that FACTS consistently outperforms baseline methods, establishing it as a practical solution for real-world query-focused table summarization. Our code is available at https://github.com/BorealisAI/FACTS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces FACTS, an agentic workflow for query-focused table summarization that generates reusable offline templates consisting of SQL queries and Jinja2 templates directly from table schemas. These templates are rendered at inference time to produce natural-language summaries, addressing limitations of fine-tuned table-to-text models (costly training, poor complex reasoning), prompt-based LLM methods (token limits, data exposure), and prior agentic pipelines (lack of robustness and scalability). The approach claims to deliver fast inference via reusability, accuracy via executable SQL, and privacy compliance by transmitting only schemas to the LLM. Evaluations on widely-used benchmarks are reported to show consistent outperformance over baselines, with code released at https://github.com/BorealisAI/FACTS.

Significance. If the empirical claims hold, the work offers a practical engineering advance for real-world table summarization systems, particularly where privacy constraints and reusability across schema-identical tables matter. The offline template paradigm could reduce repeated LLM calls and data exposure compared with online prompting approaches. Open-sourcing the code supports reproducibility, which strengthens the contribution for the NLP and data-to-text communities.

major comments (2)
  1. [§4 and §5] §4 (Agentic Workflow) and §5 (Experiments): the central claim that FACTS is a 'practical solution for real-world query-focused table summarization' rests on the agentic pipeline reliably producing correct, executable SQL+Jinja2 templates from schemas alone. No quantitative results are supplied on template-generation success rate, average retry count, failure modes for queries requiring joins/aggregations/multi-table logic, or frequency of manual intervention. Without these metrics the reported benchmark gains cannot be attributed to the claimed robustness advantage over prior agentic methods.
  2. [§5] §5 (Experimental Results): the abstract and results section assert 'consistent outperformance' on widely-used benchmarks, yet the manuscript supplies neither per-dataset scores with error bars, ablation studies isolating the contribution of offline template reuse versus the agentic generation step, nor details on query complexity distribution. This makes it impossible to verify whether the gains generalize beyond the evaluated tables or collapse on reasoning-heavy queries.
minor comments (2)
  1. [Abstract and §1] The abstract and introduction repeatedly use the phrase 'widely-used benchmarks' without naming the specific datasets or providing citations in the first occurrence.
  2. [§3] Notation for the generated artifacts (SQL query, Jinja2 template, rendered summary) is introduced informally; a small table or diagram in §3 would clarify the data flow.
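
The metrics requested in major comment 1 are cheap to compute if each template-generation run is logged. The sketch below assumes a hypothetical log record, `GenerationAttempt`, and invented example data; it is not drawn from the paper or its code release.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class GenerationAttempt:
    """One logged template-generation run (hypothetical record layout)."""
    query_id: str
    retries: int
    succeeded: bool
    failure_mode: str = ""  # e.g. "join", "aggregation"; empty on success


def summarize_attempts(attempts):
    """Compute success rate, mean retry count, and a failure-mode breakdown."""
    if not attempts:
        raise ValueError("no attempts to summarize")
    total = len(attempts)
    return {
        "success_rate": sum(a.succeeded for a in attempts) / total,
        "avg_retries": sum(a.retries for a in attempts) / total,
        "failure_modes": Counter(a.failure_mode for a in attempts if not a.succeeded),
    }


# Invented example log, for illustration only.
log = [
    GenerationAttempt("q1", retries=0, succeeded=True),
    GenerationAttempt("q2", retries=2, succeeded=True),
    GenerationAttempt("q3", retries=3, succeeded=False, failure_mode="join"),
    GenerationAttempt("q4", retries=1, succeeded=True),
]
report = summarize_attempts(log)
print(report)
```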

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps clarify the presentation of our contributions. We address each major comment below and commit to revisions that strengthen the empirical support for our claims without altering the core approach.

  1. Referee: [§4 and §5] §4 (Agentic Workflow) and §5 (Experiments): the central claim that FACTS is a 'practical solution for real-world query-focused table summarization' rests on the agentic pipeline reliably producing correct, executable SQL+Jinja2 templates from schemas alone. No quantitative results are supplied on template-generation success rate, average retry count, failure modes for queries requiring joins/aggregations/multi-table logic, or frequency of manual intervention. Without these metrics the reported benchmark gains cannot be attributed to the claimed robustness advantage over prior agentic methods.

    Authors: We agree that direct quantitative metrics on the template-generation stage would strengthen attribution of the observed gains to the robustness of the agentic workflow. In the revised manuscript we will add a dedicated analysis (new subsection in §4 or appendix) reporting template-generation success rate, average retry counts, breakdown of failure modes (with emphasis on joins, aggregations, and multi-table logic), and frequency of manual intervention across the evaluated benchmarks. These additions will make the robustness advantage over prior agentic methods more transparent while preserving the focus on end-to-end summarization performance. revision: yes

  2. Referee: [§5] §5 (Experimental Results): the abstract and results section assert 'consistent outperformance' on widely-used benchmarks, yet the manuscript supplies neither per-dataset scores with error bars, ablation studies isolating the contribution of offline template reuse versus the agentic generation step, nor details on query complexity distribution. This makes it impossible to verify whether the gains generalize beyond the evaluated tables or collapse on reasoning-heavy queries.

    Authors: We acknowledge that more granular reporting would improve verifiability. In the revision we will expand §5 to include per-dataset scores with error bars, ablation studies that isolate the contribution of offline template reuse from the agentic generation step, and a characterization of query complexity distribution (e.g., proportion of queries requiring joins or aggregations). These additions will clarify generalization and performance on reasoning-heavy queries while retaining the overall benchmark comparisons. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical engineering method evaluated on external benchmarks

The paper presents FACTS as an agentic workflow that generates reusable SQL queries and Jinja2 templates offline from table schemas, then evaluates the resulting summarization system on widely-used external benchmarks. No equations, fitted parameters, or derivation steps are described that could reduce to the method's own inputs. The central claims rest on benchmark outperformance and practical advantages (speed, accuracy, privacy) rather than any self-referential construction, self-citation chain, or ansatz smuggled via prior work. This is a standard empirical contribution with no load-bearing circular elements.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the unstated premise that large language models can generate correct and generalizable SQL-plus-Jinja2 templates from schema information alone and that these templates will remain accurate across tables sharing the same schema without post-hoc fixes.

axioms (1)
  • Domain assumption: LLMs can produce executable SQL queries and correct Jinja2 templates from table schemas that generalize to new tables with the same schema.
    This premise is required for the offline template generation step to deliver both accuracy and reusability as claimed.

pith-pipeline@v0.9.0 · 5735 in / 1303 out tokens · 31071 ms · 2026-05-18T07:19:41.668757+00:00 · methodology
