A Comparative Study of Student Perspectives on Technical Writing Feedback Quality: Evaluating LLMs, SLMs, and Humans in Computer Science Topics

Bogdan Simion; Christopher Eaton; Michael Liut; Runlong Ye; Suqing Liu

arxiv: 2601.11541 · v2 · pith:KSHW5FTDnew · submitted 2025-12-01 · 💻 cs.HC · cs.AI · cs.CY

Suqing Liu, Runlong Ye, Christopher Eaton, Bogdan Simion, Michael Liut

keywords feedback, llms, while, writing, commercial, computer, human, instructors

To address the scalability of feedback in computer science while mitigating the privacy and cost limitations of commercial Large Language Models (LLMs), this study evaluates a locally hosted Small Language Model (SLM). We deployed a quantized Llama-3.1, GPT-4, and human instructors across introductory programming (N=176), operating systems (N=80), and a writing seminar (N=7). Mixed-methods analysis of student perceptions reveals that while the local SLM matched commercial LLMs and was rated higher by students for readability and actionability in technical courses, human feedback remained more favoured for highly specialized writing tasks. We demonstrate that local SLMs offer a privacy-preserving, zero-marginal-cost alternative for foundational feedback, supporting a tiered pedagogical framework where AI handles structural guidance while instructors focus on high-level conceptual scaffolding.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

FOXGLOVE: Understanding Goal-Oriented and Anchored Writing Feedback from Experts and LLMs on Argumentative Essays
cs.CL 2026-06 unverdicted novelty 6.0

FOXGLOVE dataset of 2340 comments shows LLMs and instructors align on feedback goals and positions but diverge on sentence selection, with LLMs using more complex language and fewer questions and higher quality rating...