AICoFe: Implementation and Deployment of an AI-Based Collaborative Feedback System for Higher Education
Pith reviewed 2026-05-08 16:30 UTC · model grok-4.3
The pith
AICoFe combines multiple AI models with teacher review to turn inconsistent student peer feedback into coherent, actionable comments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AICoFe employs a modular architecture that orchestrates a multi-LLM pipeline (GPT-4.1-mini, Gemini 2.5 Flash, and Llama 3.1) to synthesize quantitative rubric data and qualitative observations into coherent, actionable feedback. A teacher-in-the-loop mediation workflow lets educators curate the drafts through Learning Analytics dashboards, and a hybrid SQL and MongoDB infrastructure provides version traceability.
What carries the argument
The multi-LLM pipeline that synthesizes rubric and observation data into feedback drafts, combined with the teacher-in-the-loop workflow that lets educators curate outputs via specialized dashboards.
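A minimal sketch of how such a pipeline and curation step could fit together is given here. It is not the authors' implementation: the names `PeerInput`, `FeedbackDraft`, `generate_drafts`, `teacher_review`, and `stub_model` are hypothetical, and the model backends are stub functions standing in for real LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Assumption: a model backend is any function mapping a prompt to draft text.
ModelFn = Callable[[str], str]


@dataclass
class PeerInput:
    rubric_scores: Dict[str, float]  # criterion -> aggregated peer score
    observations: List[str]          # free-text peer comments


@dataclass
class FeedbackDraft:
    model_name: str
    text: str
    status: str = "pending"          # becomes "approved" or "edited"
    history: List[str] = field(default_factory=list)  # earlier texts


def build_prompt(item: PeerInput) -> str:
    """Combine quantitative and qualitative peer input into one prompt."""
    scores = "; ".join(
        f"{name}: {score:.1f}" for name, score in sorted(item.rubric_scores.items())
    )
    notes = " | ".join(item.observations)
    return (
        f"Rubric scores: {scores}\n"
        f"Peer observations: {notes}\n"
        "Write coherent, actionable feedback."
    )


def generate_drafts(item: PeerInput, models: Dict[str, ModelFn]) -> List[FeedbackDraft]:
    """Send the same prompt to every backend and collect one draft per model."""
    prompt = build_prompt(item)
    return [FeedbackDraft(name, call(prompt)) for name, call in models.items()]


def teacher_review(draft: FeedbackDraft, edited_text: Optional[str] = None) -> FeedbackDraft:
    """Approve a draft unchanged, or replace its text and keep the old version."""
    if edited_text is not None and edited_text != draft.text:
        draft.history.append(draft.text)
        draft.text = edited_text
        draft.status = "edited"
    else:
        draft.status = "approved"
    return draft


def stub_model(tag: str) -> ModelFn:
    """Hypothetical stand-in for a real LLM client."""
    def call(prompt: str) -> str:
        return f"[{tag}] draft from a {len(prompt)}-character prompt"
    return call


item = PeerInput({"clarity": 3.5, "structure": 4.0}, ["Spoke too fast", "Good slides"])
drafts = generate_drafts(item, {"model_a": stub_model("A"), "model_b": stub_model("B")})
approved = teacher_review(drafts[0])
edited = teacher_review(drafts[1], "Slow down in the introduction; the slides work well.")
```

Keeping the backends behind a plain callable type is one way to make the pipeline modular: a model can be swapped or added without touching the prompt-building or review code.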
If this is right
- Educators can manage peer feedback for larger classes while preserving consistency and quality.
- Students receive more uniform and useful comments that better support their ability to reflect on their work.
- The traceable hybrid database enables systematic analysis of feedback patterns across courses and iterations.
- The system provides a practical model for deploying AI assistance in education without removing teacher judgment.
Where Pith is reading between the lines
- The same mediation pattern could apply to other collaborative student activities that require quality control, such as group project evaluations.
- Adding direct links to individual student performance data might allow the pipeline to tailor drafts more precisely.
- Wider adoption could reduce differences in feedback quality that currently depend on instructor workload or experience.
Load-bearing premise
That the multi-LLM synthesis will reliably generate coherent and actionable drafts that teachers can curate effectively enough to produce better student reflection and learning outcomes than traditional peer feedback.
What would settle it
A controlled comparison in which students receiving AICoFe-curated feedback show no greater gains in critical reflection measures or assignment performance than students receiving unassisted peer feedback.
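One way such a comparison could be analysed is a one-sided permutation test on gain scores. The sketch below assumes each student has a single numeric gain (post minus pre) on some reflection measure; the function name `permutation_test` and the numbers are illustrative placeholders, not data from the paper.

```python
import random
from statistics import fmean


def permutation_test(treated, control, n_resamples=10_000, seed=0):
    """One-sided permutation test for mean(treated) > mean(control).

    Returns the observed difference in means and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = fmean(treated) - fmean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_resamples):
        # Randomly reassign group labels by shuffling the pooled scores.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_treated]) - fmean(pooled[n_treated:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    p_value = (hits + 1) / (n_resamples + 1)
    return observed, p_value


# Synthetic placeholder gains, invented only to show the call; NOT study data.
curated_gains = [10, 11, 12, 13, 14]
unassisted_gains = [0, 1, 2, 3, 4]
observed_diff, p_value = permutation_test(curated_gains, unassisted_gains)
```

A large p-value on real data would correspond to the outcome described above: no detectable advantage for curated feedback over unassisted peer feedback.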
Original abstract
Effective peer feedback is essential for developing critical reflection in higher education, yet its impact is often limited by the inconsistent quality of student-generated comments. This paper presents the implementation and deployment of AICoFe (AI-based Collaborative Feedback), a system designed to bridge this gap through a human-centered AI approach. We describe a modular architecture that orchestrates a multi-LLM pipeline, utilizing GPT-4.1-mini, Gemini 2.5 Flash, and Llama 3.1, to synthesize quantitative rubric data and qualitative observations into coherent, actionable feedback. Key to the system is a "teacher-in-the-loop" mediation workflow, where educators use specialized Learning Analytics dashboards to curate and refine AI-generated drafts before delivery. Furthermore, we detail the underlying data infrastructure, which employs a hybrid SQL and MongoDB strategy to ensure traceability and manage semi-structured feedback versions.
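The hybrid storage idea can be illustrated with a small sketch. It assumes a relational table that holds the stable identifiers and a document collection that holds each feedback version; the schema, field names, and functions below are hypothetical, `sqlite3` stands in for the SQL side, and a plain Python list stands in for a MongoDB collection so the example runs without a database server.

```python
import sqlite3
from datetime import datetime, timezone

# Relational side: one row per feedback item, with a version counter.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE feedback (
           id INTEGER PRIMARY KEY,
           student_id TEXT NOT NULL,
           activity_id TEXT NOT NULL,
           current_version INTEGER NOT NULL DEFAULT 0
       )"""
)

# Document side: stand-in for a MongoDB collection of semi-structured versions.
versions = []


def create_feedback(student_id, activity_id):
    """Register a feedback item and return its relational id."""
    cur = conn.execute(
        "INSERT INTO feedback (student_id, activity_id) VALUES (?, ?)",
        (student_id, activity_id),
    )
    conn.commit()
    return cur.lastrowid


def add_version(feedback_id, author, text, extra=None):
    """Append a new immutable version document and bump the SQL counter."""
    row = conn.execute(
        "SELECT current_version FROM feedback WHERE id = ?", (feedback_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown feedback id {feedback_id}")
    version = row[0] + 1
    doc = {
        "feedback_id": feedback_id,
        "version": version,
        "author": author,  # e.g. a model name or a teacher id
        "text": text,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    doc.update(extra or {})  # free-form fields vary between versions
    versions.append(doc)
    conn.execute(
        "UPDATE feedback SET current_version = ? WHERE id = ?", (version, feedback_id)
    )
    conn.commit()
    return version


def history(feedback_id):
    """Return every stored version of one feedback item, oldest first."""
    docs = [d for d in versions if d["feedback_id"] == feedback_id]
    return sorted(docs, key=lambda d: d["version"])


fid = create_feedback("student-01", "presentation-1")
add_version(fid, "llm:model_a", "AI draft text", {"rubric": {"clarity": 3.5}})
add_version(fid, "teacher:t07", "Teacher-edited text")
trail = history(fid)
```

Because old versions are appended rather than overwritten, the trail from AI draft to delivered feedback can be reconstructed later, which is the traceability property the abstract points to.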
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents the implementation and deployment of AICoFe, an AI-based collaborative feedback system for higher education. It describes a modular architecture that orchestrates a multi-LLM pipeline (GPT-4.1-mini, Gemini 2.5 Flash, and Llama 3.1) to synthesize quantitative rubric data and qualitative observations into coherent, actionable feedback. Central elements include a teacher-in-the-loop mediation workflow using specialized Learning Analytics dashboards for curation and refinement of AI drafts, along with a hybrid SQL/MongoDB data infrastructure to ensure traceability and manage semi-structured feedback versions.
Significance. If the described architecture and workflow perform as outlined, the work could provide a useful contribution to HCI and educational technology by demonstrating a practical human-centered integration of multiple LLMs with educator oversight. The detailed account of modular orchestration, dashboard mediation, and hybrid storage for versioned traceability offers reusable design patterns for systems aiming to improve peer feedback consistency in higher education settings.
Major comments (1)
- [Abstract and full system description] Abstract and system description sections: The central claim that AICoFe 'bridges this gap' in peer feedback quality via the multi-LLM synthesis and teacher-in-the-loop curation is presented without any supporting evaluation data. The manuscript provides no pilot metrics, inter-rater reliability scores on feedback coherence, pre/post measures of student reflection, baseline comparisons, or teacher workload statistics, leaving the effectiveness assertion as an untested design hypothesis rather than a demonstrated result.
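As an illustration of one metric named in this comment, inter-rater reliability on feedback coherence is commonly summarised with Cohen's kappa, $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. The sketch below assumes two raters who each assign one categorical label per feedback item; the labels are placeholders, not ratings from the paper.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical label; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Placeholder labels invented for illustration only.
rater_1 = ["coherent", "coherent", "incoherent", "incoherent"]
rater_2 = ["coherent", "incoherent", "coherent", "incoherent"]
kappa = cohens_kappa(rater_1, rater_2)
```

In this placeholder case the raters agree on half the items, which is exactly what chance would predict from their label frequencies, so kappa comes out at zero.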
Simulated Author's Rebuttal
We thank the referee for the constructive review of our manuscript on the AICoFe system. We address the major comment below and will revise the paper accordingly to better scope its claims.
Point-by-point responses
Referee: [Abstract and full system description] Abstract and system description sections: The central claim that AICoFe 'bridges this gap' in peer feedback quality via the multi-LLM synthesis and teacher-in-the-loop curation is presented without any supporting evaluation data. The manuscript provides no pilot metrics, inter-rater reliability scores on feedback coherence, pre/post measures of student reflection, baseline comparisons, or teacher workload statistics, leaving the effectiveness assertion as an untested design hypothesis rather than a demonstrated result.
Authors: We agree that the manuscript contains no empirical evaluation data (e.g., pilot metrics, inter-rater reliability, pre/post reflection measures, or workload statistics) and therefore cannot demonstrate that the system bridges the gap in peer feedback quality. The paper's contribution is the detailed description of a modular multi-LLM architecture, teacher-in-the-loop curation workflow, and hybrid SQL/MongoDB storage for versioned traceability, offered as reusable design patterns for HCI and educational technology. We will revise the abstract and system description sections to frame AICoFe as an implemented system designed to address the identified challenges in peer feedback, rather than asserting demonstrated effectiveness. The revised text will explicitly note that empirical validation remains future work. This change aligns the claims with the manuscript's implementation-and-deployment focus.
Revision: yes
Circularity Check
No significant circularity; paper is purely descriptive of system design with no derivation chain.
The manuscript is an implementation and deployment report that details a modular architecture, multi-LLM orchestration (GPT-4.1-mini, Gemini 2.5 Flash, Llama 3.1), teacher-in-the-loop curation via Learning Analytics dashboards, and hybrid SQL/MongoDB storage. No equations, fitted parameters, predictions, uniqueness theorems, or mathematical derivations are present. Claims about bridging gaps in peer feedback quality are presented as design motivations and architectural choices rather than results derived from prior steps or self-citations. The paper contains no load-bearing reductions where outputs equal inputs by construction, making it self-contained as a descriptive account.