
arxiv: 2605.15848 · v1 · pith:HRG7JVXBnew · submitted 2026-05-15 · 💻 cs.HC · cs.CL

Conversations in Space: Structuring Non-Linear LLM Interactions on a Canvas

Pith reviewed 2026-05-20 18:55 UTC · model grok-4.3

classification 💻 cs.HC cs.CL
keywords non-linear conversation · spatial canvas · branching tree · LLM interface · exploratory workflow · conversational UI · field study

The pith

Non-linear branching on a spatial canvas lets users explore alternatives in LLM conversations without losing the linear chat view.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CanvasConvo to overcome the limits of linear chat when using LLMs for ideation and analysis. It turns a conversation into a branching tree placed on a canvas, so users can split off what-if paths from any message and see them side by side. The system keeps a standard chat pane available for quick replies while adding timeline navigation, automatic tags, and reusable prompts to hold structure across long sessions. A five-to-seven-day field study with 24 participants indicated that this non-linear layout supports varied, exploratory work styles, although the study included no linear-chat baseline for direct comparison.

Core claim

CanvasConvo replaces the single linear thread of an LLM chat with a branching conversation tree drawn on a spatial canvas. Branches can be created directly from any conversational content to develop parallel alternatives, and users can move between the canvas view and the familiar chat interface at any time. Supporting tools include timeline navigation for jumping across history, automatic tagging and summarization of branches, and context-aware controls such as goals and reusable prompts. The field study found that these non-linear structures enable exploratory workflows and different interaction patterns in LLM-based tasks.

What carries the argument

A branching conversation tree embedded in a spatial canvas that visualizes alternative paths while remaining connected to a standard chat interface.
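
To make the idea of a branching conversation tree concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Message` and `ConversationTree` names, their fields, and the sample messages are all assumptions made for illustration. It shows how branches can hang off any message while each branch can still be read back as an ordinary linear chat.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    """One node in the conversation tree (hypothetical structure)."""
    id: int
    role: str                      # "user" or "assistant"
    text: str
    parent: Optional[int] = None   # None marks the root message
    children: list[int] = field(default_factory=list)


class ConversationTree:
    """Hypothetical container; not taken from the paper."""

    def __init__(self) -> None:
        self.nodes: dict[int, Message] = {}
        self._next_id = 0

    def add(self, role: str, text: str, parent: Optional[int] = None) -> int:
        """Attach a message under `parent`; a second child of one parent is a branch."""
        node = Message(self._next_id, role, text, parent)
        self.nodes[node.id] = node
        if parent is not None:
            self.nodes[parent].children.append(node.id)
        self._next_id += 1
        return node.id

    def linear_view(self, leaf: int) -> list[str]:
        """Walk from a leaf back to the root to recover the linear chat for one branch."""
        path = []
        current: Optional[int] = leaf
        while current is not None:
            node = self.nodes[current]
            path.append(f"{node.role}: {node.text}")
            current = node.parent
        return list(reversed(path))

    def branch_points(self) -> list[int]:
        """Ids of messages with more than one child, where alternatives diverge."""
        return [n.id for n in self.nodes.values() if len(n.children) > 1]


# Made-up example conversation with two what-if branches from the same reply.
tree = ConversationTree()
root = tree.add("user", "Plan a product launch")
reply = tree.add("assistant", "Here is a baseline plan", parent=root)
a = tree.add("user", "What if the budget is halved?", parent=reply)
b = tree.add("user", "What if we launch in two markets?", parent=reply)
```

Calling `tree.linear_view(b)` returns only the messages on the path to branch `b`, which is one plausible way a canvas view and a chat pane could share a single underlying structure.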

If this is right

  • Users can pursue several alternative directions at once without losing earlier context or having to start over.
  • Long sessions become easier to review and resume through timeline controls and automatic summaries of branches.
  • Switching between linear chat and canvas views lets people choose the mode that fits the current stage of their work.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same canvas-plus-branching layout could be tested in collaborative settings where several people add branches to a shared conversation space.
  • Automated suggestions for new branches based on detected uncertainty in the conversation might further lower the effort needed to explore alternatives.
  • Educational tools could adopt the structure so learners can try out different reasoning paths from a single tutoring dialogue.

Load-bearing premise

The benefits observed come mainly from the ability to branch and view conversations spatially rather than from the extra tools like tagging or prompts that were also present.

What would settle it

A controlled experiment in which the same participants complete identical exploratory tasks once with a plain linear LLM chat and once with CanvasConvo, then compare time spent, number of distinct directions pursued, and reported ease of managing the session.
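
One way such a within-subjects comparison could be analyzed is a paired sign-flip permutation test on a per-participant measure, for example the number of distinct directions pursued. The sketch below is an assumption about a reasonable analysis, not something the paper reports; the function name is hypothetical and the numbers are made-up placeholders, not study data.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(linear: list[float], canvas: list[float]) -> tuple[float, float]:
    """Return (mean paired difference, exact two-sided sign-flip p-value)."""
    if len(linear) != len(canvas):
        raise ValueError("each participant needs one value per condition")
    diffs = [c - l for l, c in zip(linear, canvas)]
    observed = mean(diffs)
    # Under the null hypothesis the sign of each paired difference is arbitrary,
    # so enumerate every sign assignment. This is exact but only feasible for
    # small samples; larger samples would need random sign assignments instead.
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = mean(s * d for s, d in zip(signs, diffs))
        if abs(flipped) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / total


# Placeholder values for four imaginary participants (not real data).
linear_directions = [2, 3, 1, 2]
canvas_directions = [4, 5, 3, 4]
observed, p_value = paired_sign_flip_test(linear_directions, canvas_directions)
```

A paired design fits the proposal because every participant serves as their own control, so differences between people in general LLM fluency cancel out of the comparison.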

Figures

Figures reproduced from arXiv: 2605.15848 by Alperen Adatepe, Andreas Butz, Daniela Fernandes, Daniel Buschek, Rifat Mehreen Amin.

Figure 4: Participant ratings of perceived agency, authorship, ...
Figure 5: Participant ratings of the system’s support for ...

Original abstract

Conversational interfaces powered by large language models (LLMs) are widely used for ideation and analysis, yet their linear structure limits exploration of alternatives and management of long-running interactions. We present CanvasConvo, a conversational interface concept that transforms linear chat into a branching conversation tree embedded in a spatial canvas. CanvasConvo enables users to explore what-if scenarios by branching directly from conversational content, supporting parallel development of alternative directions. These branches are visualized on a canvas while remaining integrated with a familiar chat interface, allowing users to switch between linear and non-linear interaction. Features such as timeline-based navigation, automatic tagging and summarization, and context-aware controls (e.g., goals, reusable prompts) support structured interaction and continuity. We evaluated CanvasConvo in a 5-7 day field study with 24 participants. Our findings highlight how non-linear conversational structures support exploratory workflows and different interactions in LLM-based work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents CanvasConvo, an interface concept that converts linear LLM conversations into a branching tree structure embedded in a spatial canvas. It supports what-if exploration through direct branching, parallel development of alternatives, and includes features like timeline navigation, automatic tagging, summarization, and context-aware controls. The evaluation consists of a 5-7 day field study with 24 participants, from which the authors conclude that non-linear conversational structures support exploratory workflows and different interactions in LLM-based work.

Significance. The work addresses a relevant limitation in current LLM interfaces for tasks requiring exploration of alternatives. The spatial canvas approach combined with branching offers a novel way to manage complex interactions. If the field study findings are robust, this could inform future interface designs in human-AI interaction. The practical integration with familiar chat elements is a positive aspect.

major comments (2)
  1. [Evaluation] Evaluation section: The 5-7 day field study with 24 participants reports positive findings on exploratory workflows but lacks any baseline comparison to linear chat interfaces or quantitative metrics from interaction logs, such as branch creation rates or backtracking frequency. This makes it challenging to isolate the benefits of the non-linear canvas from novelty or general LLM usage effects.
  2. [Findings] Findings and Abstract: The central claim that non-linear structures support exploratory workflows rests on thematic interview data without reported controls for novelty effects, detailed methodology, or logged metrics; this leaves the attribution to the branching tree and spatial canvas only partially supported.
minor comments (2)
  1. [Abstract] Abstract: The abstract could more precisely indicate that findings are qualitative and note the absence of quantitative results or baseline comparisons.
  2. [Related Work] Related Work: Ensure comprehensive citation of prior non-linear or branching conversation tools to better position the contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments on our manuscript. We address the major concerns regarding the evaluation and findings below, indicating where revisions will be made to strengthen the paper while maintaining the integrity of the field study design.

  1. Referee: [Evaluation] Evaluation section: The 5-7 day field study with 24 participants reports positive findings on exploratory workflows but lacks any baseline comparison to linear chat interfaces or quantitative metrics from interaction logs, such as branch creation rates or backtracking frequency. This makes it challenging to isolate the benefits of the non-linear canvas from novelty or general LLM usage effects.

    Authors: We agree that a baseline comparison would help isolate interface-specific effects. Our study was designed as a longitudinal field study to observe authentic, multi-day usage in participants' own environments, which made a controlled baseline comparison logistically challenging without altering natural workflows. We did collect interaction logs, including data on branch creation, navigation via the timeline, and backtracking. In the revised manuscript, we will report quantitative summaries of these metrics (e.g., average branches per participant and backtracking frequency) and add them to the findings for triangulation with interview data. We will also expand the limitations section to explicitly discuss novelty effects and the trade-offs of field versus controlled study designs. revision: partial

  2. Referee: [Findings] Findings and Abstract: The central claim that non-linear structures support exploratory workflows rests on thematic interview data without reported controls for novelty effects, detailed methodology, or logged metrics; this leaves the attribution to the branching tree and spatial canvas only partially supported.

    Authors: The primary data source is thematic analysis of post-study interviews, which is standard for exploratory HCI field studies. We will revise the methodology section to provide greater detail on the interview protocol, thematic coding process, and how themes were validated. We will incorporate relevant logged metrics into the findings to support the qualitative claims. For novelty effects, we will add a dedicated limitations paragraph acknowledging this potential influence while noting that the 5-7 day duration and specific examples of sustained exploratory behavior (e.g., parallel branch development for complex tasks) provide some mitigation. We maintain that the data supports the claims about exploratory workflows but will be more explicit about the limits of causal attribution. revision: partial
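
The log summaries mentioned in the first response (average branches per participant, backtracking frequency) could be computed along the following lines. This is a rough sketch under stated assumptions: the event schema, the action names, and the sample events are invented for illustration and do not come from the study's logs.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical event log as (participant_id, action) pairs.
# The schema, the action names, and the events themselves are assumptions.
events = [
    ("P01", "message"), ("P01", "branch"), ("P01", "backtrack"), ("P01", "branch"),
    ("P02", "message"), ("P02", "message"), ("P02", "branch"), ("P02", "message"),
]


def summarize(log):
    """Aggregate per-participant action counts into study-level summaries."""
    per_participant = defaultdict(Counter)
    for participant, action in log:
        per_participant[participant][action] += 1
    counts = list(per_participant.values())
    branches = [c["branch"] for c in counts]
    # Backtracking frequency is defined here as backtracks divided by all
    # logged events for that participant; other definitions are possible.
    backtrack_rates = [c["backtrack"] / sum(c.values()) for c in counts]
    return {
        "participants": len(per_participant),
        "mean_branches": mean(branches),
        "mean_backtrack_rate": mean(backtrack_rates),
    }


summary = summarize(events)
```

Averaging per participant first, rather than pooling all events, keeps a few very active users from dominating the summary.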

Circularity Check

0 steps flagged

No circularity: empirical evaluation rests on independent user observations


The paper introduces CanvasConvo as a system concept and grounds its central claim in a 5-7 day field study with 24 participants whose reported experiences and thematic findings are presented as direct evidence. No equations, fitted parameters, predictions, or derivations appear in the abstract or described content. The evaluation does not reduce any result to a self-definition, self-citation chain, or input-by-construction step; the attribution of exploratory workflows to the non-linear canvas is offered as an empirical observation rather than a logical necessity derived from prior author work or internal fitting. This is the normal self-contained outcome for an HCI system paper whose load-bearing step is external participant data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper relies on standard HCI domain assumptions about the value of visual spatial layouts for complex information and introduces the CanvasConvo system as a new entity without external falsifiable evidence beyond the described study.

axioms (1)
  • domain assumption: Visual spatial representations help users manage and explore complex, branching conversations more effectively than linear text alone.
    Core premise invoked to justify the canvas and branching design in the abstract.
invented entities (1)
  • CanvasConvo interface (no independent evidence)
    purpose: To transform linear LLM chats into non-linear branching structures on a spatial canvas.
    New system concept proposed and evaluated in the paper.

pith-pipeline@v0.9.0 · 5698 in / 1266 out tokens · 49599 ms · 2026-05-20T18:55:56.985212+00:00 · methodology
