In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23)
Canonical reference. 82% of citing Pith papers cite this work as background.
background: 11
Representative citing papers
citing papers explorer
- Priming, Path-dependence, and Plasticity: Understanding the molding of user-LLM interaction and its implications from (many) chat logs in the wild
Large-scale analysis of wild LLM chat logs finds that user interaction patterns stabilize quickly after initial use and correlate with long-term outcomes like retention, creating an agency paradox of limited exploration in unconstrained systems.
- PersonaTeaming: Supporting Persona-Driven Red-Teaming for Generative AI
Persona-driven workflow and interface improve automated and human-AI red-teaming of generative AI by incorporating diverse perspectives into adversarial prompt creation.
- Training Computer Use Agents to Assess the Usability of Graphical User Interfaces
uxCUA is a trained computer use agent that assesses GUI usability more accurately than larger models by learning to prioritize and execute important user interactions on labeled interface datasets.
- Point & Grasp: Flexible Selection of Out-of-Reach Objects Through Probabilistic Cue Integration
Point&Grasp probabilistically integrates pointing and grasp gestures for out-of-reach object selection in MR, trained on a new ORG dataset, and outperforms single-cue baselines in user studies.
- Beyond One Output: Visualizing and Comparing Distributions of Language Model Generations
GROVE visualizes distributions of language model generations as overlapping paths through a text graph, with user studies showing that graph summaries aid structural judgments like diversity assessment while raw outputs remain better for details.
- Beyond Chat and Clicks: GUI Agents for In-Situ Assistance via Live Interface Transformation
GUI agents can transform live web interfaces in real-time via DOM manipulations to deliver contextual assistance directly within the application.
- Interactive Program Synthesis for Modeling Collaborative Physical Activities from Narrated Demonstrations
A program synthesis system models collaborative physical activities from narrated demonstrations as editable programs, enabling users to teach, inspect, and correct them, with a study showing 70% success in refining soccer tactics programs.
- Evalet: Evaluating Large Language Models through Functional Fragmentation
Evalet applies functional fragmentation to deliver fragment-level qualitative analysis of LLM evaluations, with a user study showing 48% more misalignment detections than holistic scoring.
- IdeaBlocks: Expressing and Reusing Divergent Intents for Graphic Design Exploration using Generative AI
IdeaBlocks modularizes divergent intents into Exploration Blocks with multi-level reuse options, enabling 2.13 times more images explored and 12.5% greater visual diversity than baseline in a comparative user study.
- Babel: Jailbreaking Safety Attention via Obfuscation Distribution Optimized Sampling
Babel is an efficient black-box jailbreaking framework that formalizes sparse safety attention heads via a mathematical obfuscation model and uses iterative distribution refinement to achieve higher attack success rates on models like GPT-4o and Claude-3-5-haiku with around 40 queries.
- Conversations in Space: Structuring Non-Linear LLM Interactions on a Canvas
CanvasConvo presents a spatial canvas interface for branching LLM conversations, evaluated in a 5-7 day field study with 24 participants that found support for exploratory workflows.
- Making Abstraction Concrete: A Design Space and Interaction Model of Abstraction in Interactive Systems
A survey of 457 papers yields a six-dimensional design space for abstraction in interactive systems that reframes gulfs of execution and evaluation while articulating cognitive and design processes for bridging abstraction gaps.
- Cripping AI: Reimagining AI Through Lived Disability Experiences
Cripping AI is a proposed framework that dismantles ableist assumptions in AI by centering disabled ways of knowing and respecting disabled labor in co-creation.
- MicroVRide: Exploring 4-in-1 Virtual Reality Micromobility Simulator
A modular VR simulator supports four distinct micromobility vehicles on one hardware setup and a preliminary study finds unique riding experiences for each.
- JARVIS: A Just-in-Time Augmented Reality VLM-Powered Instruction System for Cross-Reality Task Guidance
JARVIS delivers VLM-powered contextual AR guidance with state verification for cross-reality tasks, improving usability and success rates over baselines in a 14-person study.
- Beyond Compliance: How AI Could Help Creative Writers by Refusing Them
A qualitative study with 22 creative writers finds that the reflective value of AI refusals depends on alignment with users' situational thinking phases, cognitive beliefs, and views of AI roles.
- Radical Gender Neutrality: Agender Euphoria in Gaming and Play Experiences
A critical incident technique study with 142 participants identifies mechanisms by which games create or block agender euphoria and supplies empirically grounded design criteria for gender-neutral play.
- Adaptive Prompt Elicitation for Text-to-Image Generation
Adaptive Prompt Elicitation (APE) uses an information-theoretic framework to generate visual queries that elicit and compile user intent into better prompts for text-to-image models, showing improved alignment in benchmarks and a user study.
- Polite But Boring? Trade-offs Between Engagement and Psychological Reactance to Chatbot Feedback Styles
Polite chatbot feedback lowers psychological reactance and boosts behavioral intentions but lacks engagement, whereas verbal leakage heightens surprise and engagement at the expense of increased reactance.
- AnimationDiff: A Visual Comparison Tool for Generated 3D Character Animations
AnimationDiff is a visual comparison tool that combines contextual scene viewing, overlay/side-by-side modes, filtering, and temporal lenses to help users select among generated 3D character animations.
- OOPrompt: Reifying Intents into Structured Artifacts for Modular and Iterative Prompting
OOPrompt reifies user intents into structured manipulable artifacts to enable modular and iterative prompting in LLM-based interactive systems.
- Why Johnny Can't Use Agents: Industry Aspirations vs. User Realities with AI Agents
Industry markets AI agents for orchestration, creation, and insight, but a usability study with 31 participants reveals users face challenges from capability misalignment and lack of meta-cognition in tools like Operator and Manus.
- AI Technologies in Language Access: Attitudes Towards AI and the Human Value of Language Access Managers
Language access managers express conditional optimism about AI implementations but emphasize strong risk awareness and the necessity of human oversight and value in translation services.
- Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
This survey discusses key components and challenges for Personal LLM Agents and reviews solutions for their capability, efficiency, and security.
- MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents