3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: unverdicted (3)
citing papers explorer
- Effects of Generative AI Errors on User Reliance Across Task Difficulty
  Higher generative AI error rates reduce user reliance, but task difficulty does not significantly moderate this effect.
- Analyzing the Presentation, Content, and Utilization of References in LLM-powered Conversational AI Systems
  LLM chat systems show large differences in reference quantity and quality, but users rarely click or engage with them.
- Evaluation of LLM-Based Software Engineering Tools: Practices, Challenges, and Future Directions
  LLM-based SE tools lack stable ground truth and deterministic outputs, making standard evaluation assumptions invalid and requiring new approaches for reliable assessment.