GROVE visualizes distributions of language model generations as overlapping paths through a text graph, with user studies showing that graph summaries aid structural judgments like diversity assessment while raw outputs remain better for details.
Rzeszotarski
9 Pith papers cite this work.
Years: 2026 (9)
Roles: background (3)
citing papers explorer
- Beyond One Output: Visualizing and Comparing Distributions of Language Model Generations
  GROVE visualizes distributions of language model generations as overlapping paths through a text graph, with user studies showing that graph summaries aid structural judgments like diversity assessment while raw outputs remain better for details.
- NIRVANA: A Comprehensive Dataset for Reproducing How Students Use Generative AI for Essay Writing
  NIRVANA supplies keystroke-level logs, complete ChatGPT dialogues, and copied content from 77 students to reconstruct AI-assisted essay writing and classify students into four behavioral profiles: Lead Authors, Collaborators, Drafters, and Vibe Writers.
- From Intention to Text: AI-Supported Goal Setting in Academic Writing
  WriteFlow is a voice-based AI system that scaffolds metacognitive regulation in academic writing by enabling iterative goal refinement, goal-text alignment, and evaluation of goal fulfillment, as demonstrated in user studies.
- ANVIL: Analogies and Videos for Lecturers
  ANVIL automates analogy-based instructional animations for computer science by chaining LLM analogy generation, screenplay structuring, manim code production with repair, and mixed human-automated evaluations.
- Narrix: Remixing Narrative Strategies from Examples for Story Writing
  Narrix helps novices identify and reuse narrative strategies from examples through visualization and strategy-steered generation, improving retention, confidence, and adaptation over chat interfaces in a 12-person study.
- Beyond Compliance: How AI Could Help Creative Writers by Refusing Them
  A qualitative study with 22 creative writers finds that the reflective value of AI refusals depends on alignment with users' situational thinking phases, cognitive beliefs, and views of AI roles.
- The Impact of Response Latency and Task Type on Human-LLM Interaction and Perception
  Shorter LLM response latencies reduce perceived output thoughtfulness and usefulness, while task type affects prompting frequency independently of latency.
- Toward a Unified Framework for Collaborative Design of Human-AI Interaction
  A framework unifies multimodal intent interpretation, interaction-centric explainability, and agency-preserving controls as interdependent requirements for trustworthy Human-AI collaboration.
- The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation
  A literature review concludes that pursuing consensus in data annotation creates biased AI by dismissing subjective disagreements and enforcing geographic hegemony, and proposes mapping diversity instead.