Towards LLM-based autograding for short textual answers
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- "**Important** You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems
LLM-based automatic grading systems are highly vulnerable to prompt injection attacks that force high scores regardless of answer quality, and existing defenses fail to mitigate them.
- LLMs as Teaching Assistants for Mathematics Exam Grading: Reliability, and Practical Usability
Liberal partial-credit prompting reduces question-level grading error for all six tested LLMs, with ChatGPT 5.5 Thinking (LIBERAL) achieving the lowest MAE of 1.87.