Qualitative study of 19 practitioners reveals ten LLM product evaluation practices and introduces the results-actionability gap as a key barrier to turning findings into improvements.
3 Pith papers cite this work.
Years: 2026 (3)
Representative citing papers:
SABER combines self-prior with multi-trace PK and CK reasoning representations to estimate reliability beliefs and drive trust-or-abstain decisions in knowledge-conflict RAG, improving accuracy over baselines.
A graph learning framework turns heterogeneous 3D engineering data into physics-aware graphs processed by GNNs for CAE mode classification and CFD field prediction in automotive applications.
citing papers explorer
- Results-Actionability Gap: Understanding How Practitioners Evaluate LLM Products in the Wild
  Qualitative study of 19 practitioners reveals ten LLM product evaluation practices and introduces the results-actionability gap as a key barrier to turning findings into improvements.
- Trust or Abstain? A Self-Aware RAG Approach
  SABER combines self-prior with multi-trace PK and CK reasoning representations to estimate reliability beliefs and drive trust-or-abstain decisions in knowledge-conflict RAG, improving accuracy over baselines.
- Toward Generalizable Graph Learning for 3D Engineering AI: Explainable Workflows for CAE Mode Shape Classification and CFD Field Prediction
  A graph learning framework turns heterogeneous 3D engineering data into physics-aware graphs processed by GNNs for CAE mode classification and CFD field prediction in automotive applications.