IdeaBlocks modularizes divergent intents into Exploration Blocks with multi-level reuse options, enabling 2.13 times more images explored and 12.5% greater visual diversity than baseline in a comparative user study.
Title resolution pending
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 4polarities
background 4representative citing papers
CanvasConvo presents a spatial canvas interface for branching LLM conversations, evaluated in a 5-7 day field study with 24 participants that found support for exploratory workflows.
A new benchmark shows LLM smartphone agents achieve comparable success with screen text alone as with screenshots, but both fail often due to UI accessibility and reasoning gaps.
CogInstrument represents human reasoning as revisable cognitive motifs in graphical form to support iterative alignment with LLMs during planning tasks, with a N=12 study indicating gains in targeted revision, agency, and trust over standard dialogue interfaces.
LDMDroid applies LLMs in a state-aware process to trigger data manipulation functions and uses visual cues to detect errors, finding 17 bugs across 24 Android apps with 14 developer confirmations.
Generative interfaces let LLMs create task-specific UIs that users prefer up to 72% more than standard chat responses across tested tasks.
A survey of 87 agents for computer use and 33 datasets that introduces a three-dimensional taxonomy across domain, interaction, and agent perspectives and identifies six research gaps.
citing papers explorer
-
IdeaBlocks: Expressing and Reusing Divergent Intents for Graphic Design Exploration using Generative AI
IdeaBlocks modularizes divergent intents into Exploration Blocks with multi-level reuse options, enabling 2.13 times more images explored and 12.5% greater visual diversity than baseline in a comparative user study.
-
Conversations in Space: Structuring Non-Linear LLM Interactions on a Canvas
CanvasConvo presents a spatial canvas interface for branching LLM conversations, evaluated in a 5-7 day field study with 24 participants that found support for exploratory workflows.
-
Do LLMs Need to See Everything? A Benchmark and Study of Failures in LLM-driven Smartphone Automation using Screentext vs. Screenshots
A new benchmark shows LLM smartphone agents achieve comparable success with screen text alone as with screenshots, but both fail often due to UI accessibility and reasoning gaps.
-
CogInstrument: Modeling Cognitive Processes for Bidirectional Human-LLM Alignment in Planning Tasks
CogInstrument represents human reasoning as revisable cognitive motifs in graphical form to support iterative alignment with LLMs during planning tasks, with a N=12 study indicating gains in targeted revision, agency, and trust over standard dialogue interfaces.
-
LDMDroid: Leveraging LLMs for Detecting Data Manipulation Errors in Android Apps
LDMDroid applies LLMs in a state-aware process to trigger data manipulation functions and uses visual cues to detect errors, finding 17 bugs across 24 Android apps with 14 developer confirmations.
-
Generative Interfaces for Language Models
Generative interfaces let LLMs create task-specific UIs that users prefer up to 72% more than standard chat responses across tested tasks.
-
A Comprehensive Survey of Agents for Computer Use: Foundations, Challenges, and Future Directions
A survey of 87 agents for computer use and 33 datasets that introduces a three-dimensional taxonomy across domain, interaction, and agent perspectives and identifies six research gaps.