4 Pith papers cite this work.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- Fast and Forgettable: A Controlled Study of Novices' Performance, Learning, Workload, and Emotion in AI-Assisted and Human Pair Programming Paradigms
  Novices performed better and reported lower workload with GitHub Copilot than with human partners, but human partners produced more positive emotions and a smaller drop in retest performance after one week.
- Draft-Refine-Optimize: Self-Evolved Learning for Natural Language to MongoDB Query Generation
  EvoMQL uses iterative Draft-Refine-Optimize cycles with execution feedback to reach 76.6% accuracy on EAI and 83.1% on TEND benchmarks for natural language to MongoDB query generation.
- TeamUp: Semantic Project Matching and Team Formation for Learning at Scale
  TeamUp matches students to projects with semantic embeddings and builds diverse teams by modeling skill complementarity through embedding variance, outperforming traditional methods in a virtual test with higher match quality, better difficulty alignment, and greater team diversity.
- Ceci n'est pas une explication: Evaluating Explanation Failures as Explainability Pitfalls in Language Learning Systems
  Introduces L2-Bench benchmark for AI feedback in language education across six dimensions and identifies explainability pitfalls in AI-generated explanations that appear helpful but are flawed.