Automatic Slide Updating with User-Defined Dynamic Templates and Natural Language Instructions
Pith reviewed 2026-05-10 04:08 UTC · model grok-4.3
The pith
The paper defines a task for updating user-designed slides from natural language instructions and provides a benchmark and agent system to do it.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Dynamic Slide Update via Natural Language Instructions on User-provided Templates is a new task that requires an agent to modify slide content according to instructions while strictly preserving the original layout and visual style. The DynaSlide benchmark supplies 20,036 triples of source slide, instruction, and target slide for training and evaluation. SlideAgent solves this by multimodal parsing of the slide, grounding the instruction in the slide elements, and employing tools for numerical and textual updates.
What carries the argument
SlideAgent, an agent-based system that merges multimodal slide parsing, natural language instruction grounding, and tool-augmented reasoning to perform updates on tables, charts, and text.
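The three stages named above can be pictured with a toy example. The sketch below is illustrative only and is not SlideAgent's code: the `Element` type, the `DATABASE` dictionary, the kind-based `ground` rule, and the `update_table` tool are hypothetical stand-ins. It assumes the slide is already parsed into elements and the natural language instruction is already reduced to structured slots.

```python
from dataclasses import dataclass, replace

# Hypothetical parsed-slide element: layout and style are stored apart from content.
@dataclass(frozen=True)
class Element:
    elem_id: str
    kind: str        # "title", "table", or "text"
    bbox: tuple      # (x, y, width, height); an update must not change this
    style: str       # opaque style token; an update must not change this
    content: object

# Toy stand-in for the shared external database: (city, year) -> units sold.
DATABASE = {
    ("Springfield", 2022): 120,
    ("Springfield", 2023): 150,
    ("Shelbyville", 2023): 90,
}

def ground(slide: list) -> set:
    """Toy grounding rule: a city/year change touches titles and tables only."""
    return {e.elem_id for e in slide if e.kind in {"title", "table"}}

def update_table(city: str, years: list) -> list:
    """Tool call: rebuild table rows from the database instead of guessing values."""
    return [(year, DATABASE[(city, year)]) for year in years]

def apply_update(slide: list, instruction: dict) -> list:
    """Return a new slide with grounded elements updated and everything else untouched."""
    targets = ground(slide)
    city, years = instruction["city"], instruction["years"]
    updated = []
    for e in slide:
        if e.elem_id not in targets:
            updated.append(e)
        elif e.kind == "table":
            updated.append(replace(e, content=update_table(city, years)))
        else:  # title
            updated.append(replace(e, content=f"{city} units sold, {years[0]}-{years[-1]}"))
    return updated

source = [
    Element("t1", "title", (0, 0, 800, 60), "heading-1", "Shelbyville units sold, 2023-2023"),
    Element("tb1", "table", (40, 100, 720, 300), "grid-blue", [(2023, 90)]),
    Element("n1", "text", (40, 420, 720, 40), "footnote", "Source: internal database"),
]
target = apply_update(source, {"city": "Springfield", "years": [2022, 2023]})
```

Because `replace` copies every field except `content`, the bounding boxes and style tokens of `target` are identical to those of `source` by construction. On real user-authored slides that guarantee is exactly what has to be demonstrated empirically.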
If this is right
- Slides can be refreshed automatically from instructions without recreating layouts.
- The shared database grounding allows consistent updates across related slides.
- Evaluation protocols show specific failure modes in content accuracy and layout fidelity.
- Future systems can build on this baseline for better performance on complex visuals.
Where Pith is reading between the lines
- This framework could extend to updating other visual documents like dashboards or infographics.
- Connecting the agent directly to live databases might enable real-time slide maintenance.
- Error patterns in the evaluations point to a need for better chart-understanding modules.
Load-bearing premise
That existing multimodal and reasoning tools, when combined in an agent, can correctly interpret instructions and apply changes to arbitrary user slide designs without errors or style violations.
What would settle it
A test set of slides containing intricate charts and tables where SlideAgent produces outputs that deviate from the target slides in either data values or visual arrangement.
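One way to make such a test concrete is to score a predicted slide against its target separately on the two axes named here, data values and visual arrangement. The sketch below assumes slides are dictionaries keyed by element id with `content`, `bbox`, and `style` fields. This representation, the function name `compare_slides`, and the `bbox_tol` tolerance are illustrative assumptions, not the paper's evaluation protocol.

```python
def compare_slides(predicted: dict, target: dict, bbox_tol: float = 0.0) -> dict:
    """Return content and layout mismatches between two slides.

    Each slide maps element id -> {"content": ..., "bbox": (x, y, w, h), "style": str}.
    """
    content_errors, layout_errors = [], []
    for elem_id, want in target.items():
        got = predicted.get(elem_id)
        if got is None:
            # A missing element counts against both axes.
            content_errors.append(elem_id)
            layout_errors.append(elem_id)
            continue
        if got["content"] != want["content"]:
            content_errors.append(elem_id)
        moved = any(abs(a - b) > bbox_tol for a, b in zip(got["bbox"], want["bbox"]))
        if moved or got["style"] != want["style"]:
            layout_errors.append(elem_id)
    # Elements that exist only in the prediction also change the arrangement.
    layout_errors.extend(sorted(set(predicted) - set(target)))
    return {"content_errors": content_errors, "layout_errors": layout_errors}

target = {
    "title": {"content": "Sales 2023", "bbox": (0, 0, 800, 60), "style": "h1"},
    "table": {"content": [(2023, 150)], "bbox": (40, 100, 720, 300), "style": "grid"},
}
predicted = {
    "title": {"content": "Sales 2023", "bbox": (0, 0, 800, 60), "style": "h1"},
    "table": {"content": [(2023, 105)], "bbox": (40, 120, 720, 300), "style": "grid"},
}
report = compare_slides(predicted, target)
```

In this toy case the table has both a wrong value and a shifted position, so it appears in both error lists. Reporting the two lists separately keeps a content hallucination from being masked by a perfect layout, and vice versa.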
Original abstract
Presentation slides are a primary medium for data-driven reporting, yet keeping complex, analytics-style decks up to date remains labor-intensive. Existing automation methods mostly follow fixed template filling and cannot support dynamic updates for diverse, user-authored slide decks. We therefore define "Dynamic Slide Update via Natural Language Instructions on User-provided Templates" and introduce DynaSlide, a large-scale benchmark with 20,036 real-world instruction-execution triples (source slide, user instruction, target slide) grounded in a shared external database and built from business reporting slides under bring-your-own-template (BYO-template) conditions. To tackle this task, we propose SlideAgent, an agent-based framework that combines multimodal slide parsing, natural language instruction grounding, and tool-augmented reasoning for tables, charts, and textual conclusions. SlideAgent updates content while preserving layout and style, providing a strong reference baseline on DynaSlide. We further design end-to-end and component-level evaluation protocols that reveal key challenges and opportunities for future research. The dataset and code are available at https://github.com/XiaoZhou2024/SlideAgent.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper defines the task of 'Dynamic Slide Update via Natural Language Instructions on User-provided Templates' and introduces DynaSlide, a benchmark of 20,036 real-world instruction-execution triples (source slide, instruction, target slide) constructed from business reporting slides under bring-your-own-template conditions and grounded in an external database. It proposes SlideAgent, an agent-based framework combining multimodal slide parsing, natural language instruction grounding, and tool-augmented reasoning for tables, charts, and text to update content while preserving layout and style, positions it as a strong baseline, and describes end-to-end and component-level evaluation protocols. The dataset and code are released.
Significance. If the framework's performance claims hold under rigorous validation, the work would be significant for practical automation of data-driven slide maintenance and for advancing multimodal agent research in document editing. The large-scale, real-world benchmark construction and public release of data/code are clear strengths that enable reproducibility and follow-on work.
major comments (2)
- [Evaluation protocols] Evaluation protocols section: the manuscript describes end-to-end and component-level evaluation protocols but supplies no quantitative results, error analysis, or explicit factual-accuracy audit against ground-truth target slides. This is load-bearing for the central claim that SlideAgent constitutes a 'strong reference baseline,' because any mismatch between parsed slide state and database state or errors in the agent's reasoning trace would directly falsify reliable update performance on diverse BYO-template decks.
- [SlideAgent framework] SlideAgent framework description: the claim that combining existing multimodal parsing, instruction grounding, and tool use reliably preserves layout/style without content errors or hallucinations rests on an untested assumption. No ablation studies or concrete evidence are provided to show that off-the-shelf components suffice for the diverse, user-authored slides in DynaSlide.
minor comments (1)
- [Abstract] Abstract: the phrase 'providing a strong reference baseline' is stated without reference to any specific metrics or results; adding a brief indication of key findings would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments highlight important areas for strengthening the manuscript's claims about evaluation and the framework. We address each major comment below and commit to revisions that incorporate quantitative results, error analysis, and ablations.
- Referee: [Evaluation protocols] Evaluation protocols section: the manuscript describes end-to-end and component-level evaluation protocols but supplies no quantitative results, error analysis, or explicit factual-accuracy audit against ground-truth target slides. This is load-bearing for the central claim that SlideAgent constitutes a 'strong reference baseline,' because any mismatch between parsed slide state and database state or errors in the agent's reasoning trace would directly falsify reliable update performance on diverse BYO-template decks.
  Authors: We agree that the absence of quantitative results, error analysis, and factual-accuracy audits weakens the support for claiming SlideAgent as a strong baseline. The current manuscript emphasizes the design of the protocols and the benchmark but does not present the actual performance numbers or audits. In the revised version, we will add the full quantitative results from our end-to-end and component-level evaluations (including accuracy metrics against ground-truth target slides), a detailed error analysis, and discussion of any parsing or reasoning mismatches. This will directly address the concern about potential falsification of performance claims.
  Revision: yes
- Referee: [SlideAgent framework] SlideAgent framework description: the claim that combining existing multimodal parsing, instruction grounding, and tool use reliably preserves layout/style without content errors or hallucinations rests on an untested assumption. No ablation studies or concrete evidence are provided to show that off-the-shelf components suffice for the diverse, user-authored slides in DynaSlide.
  Authors: We acknowledge that the framework section relies on the integration of existing components without sufficient ablations or concrete evidence for the diverse BYO-template slides. While component-level evaluations are described, they do not include systematic ablations. In the revision, we will add ablation studies (e.g., removing individual modules like tool-augmented reasoning or multimodal parsing) and provide concrete examples and metrics demonstrating layout/style preservation, content accuracy, and handling of hallucinations on DynaSlide slides. This will replace the assumption with empirical support.
  Revision: yes
Circularity Check
No circularity: task definition, external benchmark, and baseline framework are independent of self-referential inputs
full rationale
The paper defines the task of Dynamic Slide Update via Natural Language Instructions on User-provided Templates, constructs DynaSlide as 20,036 real-world triples (source slide, instruction, target slide) drawn from external business reporting slides under BYO-template conditions and grounded in a shared external database, and proposes SlideAgent as an agent framework that assembles existing multimodal parsing, instruction grounding, and tool-augmented reasoning components to serve as a reference baseline. Evaluation protocols are applied to this independently sourced benchmark. No equations, fitted parameters renamed as predictions, load-bearing self-citations, uniqueness theorems, or ansatzes appear in the derivation; the central claims rest on empirical results against external data rather than reducing to the paper's own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Multimodal models can accurately parse layout, text, tables, and charts from user-authored slides
- domain assumption: Tool-augmented reasoning can correctly interpret natural language instructions and apply changes to slide content
invented entities (2)
- DynaSlide benchmark: no independent evidence
- SlideAgent framework: no independent evidence