Large language models for education and research: An empirical and user survey-based analysis

2025 · arXiv 2512.08057

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

LLM-as-a-Judge for Human-AI Co-Creation: A Reliability-Aware Evaluation Framework for Coding

cs.SE · 2026-04-30 · unverdicted · novelty 5.0

LLM judges for human-AI coding co-creation show moderate performance (ROC-AUC 0.59) and low agreement, with co-creation success concentrating early in interactions.

citing papers explorer

Showing 1 of 1 citing paper.

LLM-as-a-Judge for Human-AI Co-Creation: A Reliability-Aware Evaluation Framework for Coding

cs.SE · 2026-04-30 · unverdicted · none · ref 2
LLM judges for human-AI coding co-creation show moderate performance (ROC-AUC 0.59) and low agreement, with co-creation success concentrating early in interactions.
