Understanding User Experience in Large Language Model Interactions
In the rapidly evolving landscape of large language models (LLMs), most research has primarily viewed them as independent individuals, focusing on assessing their capabilities through standardized benchmarks and enhancing their general intelligence. This perspective, however, tends to overlook the vital role of LLMs as user-centric services in human-AI collaboration. This gap in research becomes increasingly critical as LLMs become more integrated into people's everyday and professional interactions. This study addresses the important need to understand user satisfaction with LLMs by exploring four key aspects: comprehending user intents, scrutinizing user experiences, addressing major user concerns about current LLM services, and charting future research paths to bolster human-AI collaborations. Our study develops a taxonomy of 7 user intents in LLM interactions, grounded in analysis of real-world user interaction logs and human verification. Subsequently, we conduct a user survey to gauge their satisfaction with LLM services, encompassing usage frequency, experiences across intents, and predominant concerns. This survey, compiling 411 anonymous responses, uncovers 11 first-hand insights into the current state of user engagement with LLMs. Based on this empirical analysis, we pinpoint 6 future research directions prioritizing the user perspective in LLM developments. This user-centered approach is essential for crafting LLMs that are not just technologically advanced but also resonate with the intricate realities of human interactions and real-world applications.
Forward citations
Cited by 5 Pith papers
- Robust Text Watermarking for Large Language Models via Dual Semantic Embeddings
  DEW creates a robust watermark for LLM text by applying vector-space operations to dual embeddings and hiding the signal via key-seeded random projections, showing improved detection after paraphrasing and translation.
- Measuring Distribution Shift in User Prompts and Its Effects on LLM Performance
  The LENS framework applied to 192 real-world settings shows moderate natural prompt distribution shifts cause 73% average performance loss in deployed LLMs, especially across user groups and regions.
- Robust Text Watermarking for Large Language Models via Dual Semantic Embeddings
  DEW is a semantic watermarking method for LLMs that derives a robust signal from dual embeddings via vector-space algebra and pseudo-random projections, remaining detectable after paraphrasing and translation.
- Learning to Act under Noise: Enhancing Agent Robustness via Noisy Environments
  NoisyAgent trains LLM agents with controlled user and tool noise to improve robustness in stochastic environments while also boosting clean-benchmark performance.
- An Empirical Study of Perceptions of General LLMs and Multimodal LLMs on Hugging Face
  Hugging Face discussions show that access barriers, output quality, and setup complexity are the main user concerns for both general and multimodal LLMs.