A LLM-powered automatic grading framework with human-level guidelines optimization

2024 · arXiv 2410.02165

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Learnable Assessment Skills for LLM-based Automated Scoring: Rubric Construction via Iterative Optimization

cs.CL · 2026-05-28 · unverdicted · novelty 6.0

An iterative framework lets LLMs learn procedural assessment skills for rubric construction, improving automated scoring on all ten ASAP-SAS items and often exceeding expert rubrics while showing cross-item transfer.

"**Important** You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems

cs.CR · 2026-06-02 · unverdicted · novelty 5.0

LLM-based automatic grading systems are highly vulnerable to prompt injection attacks that force high scores regardless of answer quality, and existing defenses fail to mitigate them.

LLMs as Teaching Assistants for Mathematics Exam Grading: Reliability, and Practical Usability

cs.CY · 2026-06-01 · unverdicted · novelty 5.0

Liberal partial-credit prompting reduces question-level grading error for all six tested LLMs, with ChatGPT 5.5 Thinking (LIBERAL) achieving the lowest MAE of 1.87.

Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering

cs.CY · 2026-05-08 · unverdicted · novelty 3.0

LLM graders achieve substantial human agreement on math and science MCAS items but vary on ELA, performing best as sources of formative narrative feedback rather than summative numerical scores.

citing papers explorer

Learnable Assessment Skills for LLM-based Automated Scoring: Rubric Construction via Iterative Optimization · cs.CL · 2026-05-28 · unverdicted · none · ref 10
"**Important** You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems · cs.CR · 2026-06-02 · unverdicted · none · ref 2
LLMs as Teaching Assistants for Mathematics Exam Grading: Reliability, and Practical Usability · cs.CY · 2026-06-01 · unverdicted · none · ref 11
Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering · cs.CY · 2026-05-08 · unverdicted · none · ref 174
