Semantic Prompting: Agentic Incremental Narrative Refinement through Spatial Semantic Interaction

Chris North; Eric Krokos; Ibrahim Tahmid; Kirsten Whitley; Xuan Wang; Xuxin Tang

arxiv: 2604.19971 · v2 · pith:CTCTZR27new · submitted 2026-04-21 · 💻 cs.HC · cs.AI

Xuxin Tang, Ibrahim Tahmid, Eric Krokos, Kirsten Whitley, Xuan Wang, Chris North

Pith reviewed 2026-05-10 01:16 UTC · model grok-4.3

keywords semantic prompting · spatial layouts · narrative refinement · human-AI interaction · incremental sensemaking · LLM steering · intent alignment · interactive refinement

The pith

Semantic Prompting lets LLMs interpret spatial layout changes to make targeted narrative revisions instead of full regenerations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Semantic Prompting to close three gaps in using LLMs with spatial layouts for narrative generation: misalignment between user spatial interactions and model revisions, misalignment between human intent and model output, and insufficient control over fine details. The approach works by having the model detect semantic relationships among positioned elements, infer what refinement the user wants, and apply only the necessary changes to the text. A user study with fourteen participants found that people could steer the process incrementally through spatial moves and that the results felt more precise and aligned with their goals. This matters for sensemaking tasks where information organization happens gradually in space rather than in one-shot text prompts. If the method holds, it keeps the evolving spatial structure intact while updating only the relevant story parts.

Core claim

Semantic Prompting is a framework for spatial refinement that perceives semantic interactions, reasons about refinement intent, and performs targeted positional revisions. Implemented as S-PRISM, the system enhances the precision of interaction-revision refinement and supports incremental formalization through interactive steering, as shown in an empirical evaluation and a user study with fourteen participants who valued its efficient, adaptable, and trustworthy support for strengthening human-LLM intent alignment.

What carries the argument

Semantic Prompting framework that perceives semantic interactions from spatial layouts, reasons about the user's refinement intent, and executes targeted positional revisions to the generated narrative.
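
The perceive, reason, revise loop named above can be pictured with a minimal Python sketch. This is not S-PRISM's implementation: the `Note` structure, the nearest-neighbour heuristic used as a stand-in for "semantic interaction", and the `rewrite` callback that takes the place of an LLM call are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Note:
    """A document card on the 2D workspace (hypothetical structure)."""
    note_id: str
    x: float
    y: float


def nearest(note, layout):
    """Return the id of the closest other note, or None if the note is alone."""
    others = [n for n in layout.values() if n.note_id != note.note_id]
    if not others:
        return None
    return min(others, key=lambda n: hypot(n.x - note.x, n.y - note.y)).note_id


def perceive(before, after):
    """Step 1: for each moved note, record its old and new nearest neighbour."""
    changes = []
    for note_id, note in after.items():
        prev = before.get(note_id)
        if prev is None or (prev.x, prev.y) == (note.x, note.y):
            continue  # new or unmoved notes are ignored in this sketch
        changes.append((note_id, nearest(prev, before), nearest(note, after)))
    return changes


def reason(changes):
    """Step 2: turn each change into a refinement intent.

    A fixed rule stands in for the LLM's reasoning here (assumption).
    """
    return [
        {"note": nid, "detach_from": old, "attach_to": new}
        for nid, old, new in changes
        if old != new
    ]


def revise(narrative, intents, rewrite):
    """Step 3: rewrite only the passages tied to affected notes."""
    revised = dict(narrative)  # untouched passages are carried over as-is
    for intent in intents:
        nid = intent["note"]
        revised[nid] = rewrite(narrative[nid], intent)
    return revised
```

In this sketch the narrative is a mapping from note id to passage, so a layout change that affects one note leaves every other passage byte-for-byte identical. That is the property the review contrasts with full regeneration.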

If this is right

Interaction-revision refinement achieves higher precision than collage-based or full regeneration methods.
Users perform incremental formalization of narratives through direct interactive steering of spatial elements.
Human-LLM intent alignment improves because revisions stay local and responsive to layout semantics.
The resulting support feels efficient, adaptable, and trustworthy to participants in sensemaking workflows.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar spatial-semantic steering could apply to non-narrative tasks such as refining data summaries or organizing research notes.
Over repeated sessions the spatial history might serve as a persistent record of how the narrative evolved.
Integration with existing visualization or mind-mapping tools could let users treat layout changes as the primary control surface for AI assistance.

Load-bearing premise

LLMs can accurately perceive semantic interactions from spatial layouts and reason about refinement intent without persistent human-LLM misalignment.

What would settle it

A test measuring the percentage of cases where specific spatial adjustments by users produce narrative updates that match their stated refinement intentions, compared against full text regeneration baselines.
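
A test of this kind reduces to a per-condition match rate. The sketch below shows one way to tabulate it in Python; the `(condition, matched)` record shape and the function name `match_rates` are assumptions for illustration, and no values from the paper are used.

```python
from collections import defaultdict


def match_rates(trials):
    """Return the percentage of trials whose update matched the stated intent.

    `trials` is an iterable of (condition, matched) pairs, where `condition`
    names the method under test and `matched` is a bool judged by a rater.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for condition, matched in trials:
        totals[condition] += 1
        hits[condition] += int(matched)
    return {c: 100.0 * hits[c] / totals[c] for c in totals}
```

Each trial would pair one spatial adjustment with a rater's judgment of whether the resulting narrative update matched the participant's stated intent, with one condition label for targeted revision and another for the regeneration baseline.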

Figures

Figures reproduced from arXiv: 2604.19971 by Chris North, Eric Krokos, Ibrahim Tahmid, Kirsten Whitley, Xuan Wang, Xuxin Tang.

**Figure 2.** S-PRISM’s multi-agent pipeline. Users first in …

**Figure 3.** The S-PRISM interface. (A) A direct-manipulation zoomable workspace for spatial document organization and …

**Figure 4.** Phase I: An overview of the four tasks (T), il …

**Figure 5.** Phase I Behavior Patterns. (a) Average correct …

**Figure 6.** Subjective Ratings. 7-point Likert scores for …

**Figure 7.** P4 refined the workspace for refining reports …

Original abstract

Interactive spatial layouts empower users to synthesize information and organize findings for sensemaking. While Large Language Models (LLMs) can automate narrative generation from spatial layouts, current collage-based and re-generation methods struggle to support the incremental spatial refinements inherent to the sensemaking process. We identify three critical gaps in existing spatial-textual generation: interaction-revision misalignment, human-LLM intent misalignment, and lack of granular customization. To address these, we introduce Semantic Prompting, a framework for spatial refinement that perceives semantic interactions, reasons about refinement intent, and performs targeted positional revisions. We implemented S-PRISM to realize this framework. The empirical evaluation demonstrated that S-PRISM effectively enhanced the precision of interaction-revision refinement. A user study ($N=14$) highlighted how participants leveraged S-PRISM for incremental formalization through interactive steering. Results showed that users valued its efficient, adaptable, and trustworthy support, which effectively strengthens human-LLM intent alignment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Semantic Prompting gives a workable way to handle incremental spatial refinements with LLMs, but the N=14 study only shows subjective approval without objective metrics or baselines.

The paper introduces Semantic Prompting as a framework to let users steer LLM narrative generation through spatial layout changes instead of full regenerations or collages. It targets three gaps: interaction-revision misalignment, human-LLM intent gaps, and coarse control. S-PRISM implements this by extracting semantic relations from positions and applying targeted revisions. That framing matches real sensemaking workflows where people tweak layouts gradually, and the approach avoids restarting the whole output each time. The user study reports that the 14 participants used it for incremental formalization and found it efficient and trustworthy, which aligns with the design goals. The framework itself looks like a straightforward extension of existing spatial interfaces with LLM reasoning layered on top.

The main weakness is the evaluation. The claim of enhanced precision rests on that small study, yet no quantitative measures appear: no edit distances, semantic similarity scores, revision counts, baseline comparisons, or statistical tests. We only get qualitative feedback on perceived value. Without those, it is hard to separate the framework's contribution from the appeal of any new interface. The assumption that the LLM reliably infers intent from spatial semantics is central but not directly tested against alternatives.

This work fits HCI researchers building tools that combine spatial organization with generative models. Readers working on prompt strategies or visualization sensemaking could pick up the framework as a concrete starting point. It deserves peer review because the problem is real and the idea is implementable, though the authors will need to add controlled comparisons and objective outcomes to strengthen the results.

Referee Report

2 major / 2 minor

Summary. The paper proposes Semantic Prompting, a framework implemented as S-PRISM, to enable agentic incremental narrative refinement from spatial layouts. It identifies three gaps in existing collage-based and regeneration methods (interaction-revision misalignment, human-LLM intent misalignment, and lack of granular customization), then claims that perceiving semantic interactions, reasoning about refinement intent, and performing targeted positional revisions addresses them. An empirical evaluation is said to demonstrate enhanced precision of interaction-revision refinement, while a user study (N=14) reports that participants leveraged the system for incremental formalization and valued its efficient, adaptable, and trustworthy support.

Significance. If the central claims hold, the work could meaningfully advance HCI research on LLM-assisted sensemaking by shifting from one-shot generation to incremental, spatially steered refinement. The framework's emphasis on targeted revisions and intent alignment offers a concrete alternative to current spatial-textual pipelines; a reproducible implementation and falsifiable user-study protocol would strengthen its contribution.

major comments (2)

[Abstract and Evaluation] Abstract and Evaluation section: the claim that S-PRISM 'effectively enhanced the precision of interaction-revision refinement' is unsupported by any quantitative metric (e.g., edit distance to target narrative, semantic similarity, revision count, or inter-rater agreement), baseline condition, or statistical test. The N=14 study reports only subjective preference and 'leveraged for incremental formalization,' which does not establish a measurable precision gain over collage or regeneration methods.
[Framework and User Study] Framework and User Study sections: the core assumption that the LLM can reliably extract semantic interactions from spatial layouts and infer refinement intent without persistent misalignment is never independently tested. The study records only post-hoc user valuation of 'trustworthy support'; no objective probe (e.g., alignment error rate or comparison of LLM-inferred vs. user-intended revisions) is described, leaving the headline result vulnerable to interface-novelty confounds.
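
To make the first major comment concrete, the sketch below shows one of the metrics the referee names, an edit distance between a revised narrative and a target narrative, computed at word level in Python. The function names and the choice of word-level Levenshtein distance are assumptions for illustration; the paper does not specify a metric.

```python
def word_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two texts, counted in whole words."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))  # distance from empty prefix of x
    for i, wx in enumerate(x, start=1):
        curr = [i]
        for j, wy in enumerate(y, start=1):
            cost = 0 if wx == wy else 1
            curr.append(min(
                prev[j] + 1,         # delete a word from x
                curr[j - 1] + 1,     # insert a word from y
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer text's word count."""
    longest = max(len(a.split()), len(b.split()), 1)
    return word_edit_distance(a, b) / longest
```

A lower distance to a participant-approved target narrative, compared across targeted revision and a regeneration baseline, would be one objective way to back a precision claim.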

minor comments (2)

[Abstract] The abstract contains LaTeX markup ($N=14$) that should be rendered consistently for journal submission.
[Implementation] No explicit description of the spatial encoding scheme or prompting template used in S-PRISM is provided, making the implementation details difficult to reproduce.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which has helped us identify areas where the manuscript can be strengthened. We address each major comment point by point below, indicating where revisions will be made to the next version of the paper.

Referee: [Abstract and Evaluation] Abstract and Evaluation section: the claim that S-PRISM 'effectively enhanced the precision of interaction-revision refinement' is unsupported by any quantitative metric (e.g., edit distance to target narrative, semantic similarity, revision count, or inter-rater agreement), baseline condition, or statistical test. The N=14 study reports only subjective preference and 'leveraged for incremental formalization,' which does not establish a measurable precision gain over collage or regeneration methods.

Authors: We agree that the current abstract and evaluation presentation does not include explicit quantitative metrics, baselines, or statistical tests to support the precision enhancement claim. The empirical evaluation is grounded in the user study's demonstration of incremental formalization, but we acknowledge this leaves the claim open to the concerns raised. In the revised manuscript, we will expand the Evaluation section to report quantitative measures including edit distances to target narratives, semantic similarity scores, revision counts, and direct comparisons to collage-based and regeneration baselines, accompanied by statistical tests. The abstract will be updated to reflect these additions accurately without overstating the current results.

revision: yes

Referee: [Framework and User Study] Framework and User Study sections: the core assumption that the LLM can reliably extract semantic interactions from spatial layouts and infer refinement intent without persistent misalignment is never independently tested. The study records only post-hoc user valuation of 'trustworthy support'; no objective probe (e.g., alignment error rate or comparison of LLM-inferred vs. user-intended revisions) is described, leaving the headline result vulnerable to interface-novelty confounds.

Authors: The referee is correct that the manuscript validates the framework's assumptions on semantic interaction extraction and intent inference primarily through post-hoc user feedback on trustworthiness rather than through dedicated, independent objective probes. This indirect approach supports the practical utility for incremental formalization but does not fully isolate alignment performance or rule out novelty effects. To address this, the revised version will add an objective evaluation subsection describing alignment error rates (via comparison of LLM-inferred revisions to user-specified ground truth) and a baseline condition to control for interface novelty. These additions will be integrated into the Framework and User Study sections.

revision: yes

Circularity Check

0 steps flagged

No circularity; framework and user study are self-contained without derivations or self-referential reductions

The paper identifies three gaps in spatial-textual generation, introduces the Semantic Prompting framework to perceive semantic interactions and perform targeted revisions, implements it as S-PRISM, and evaluates via a descriptive user study (N=14) reporting subjective valuation of efficiency and alignment. No equations, parameter fittings, uniqueness theorems, or load-bearing self-citations appear in the provided text. The central claims about precision enhancement and incremental formalization rest on the framework description and study observations rather than reducing by construction to inputs or prior author work. This is a standard non-circular HCI framework paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based solely on abstract; the framework rests on assumptions about LLM capabilities for semantic perception and intent reasoning, with no free parameters or invented entities explicitly quantified.

axioms (1)

domain assumption: LLMs can perceive semantic interactions from spatial layouts and reason about refinement intent to perform targeted revisions.
This is the core premise enabling the Semantic Prompting framework as described.

invented entities (1)

Semantic Prompting framework (no independent evidence)
purpose: To bridge spatial interactions with LLM narrative refinements for incremental sensemaking.
Newly proposed method without independent evidence outside the paper's evaluation.

pith-pipeline@v0.9.0 · 5472 in / 1179 out tokens · 42379 ms · 2026-05-10T01:16:43.451059+00:00 · methodology

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Drag, Infer, Reproject: Grounding LLMs through Spatial Interaction for Image Clustering
cs.HC 2026-06 unverdicted novelty 6.0

CriterionSI infers clustering criteria from incremental user drags via LLMs and applies them to steer image reprojections, shown effective in simulation and usage scenarios.

Drag, Infer, Reproject: Grounding LLMs through Spatial Interaction for Image Clustering
cs.HC 2026-06 unverdicted novelty 6.0

CriterionSI infers clustering criteria from sequential user drags via LLMs to produce progressively aligned image cluster layouts.