Exploring the context of course rankings on online academic forums

Bob Edmison; Daron Williams; Larry Cox II; Matthew Louvet; Taha Hassan

arxiv: 1907.05846 · v1 · pith:BHKCDUHQnew · submitted 2019-07-10 · 💻 cs.CY · cs.SI


Taha Hassan, Bob Edmison, Larry Cox II, Matthew Louvet, Daron Williams

Pith reviewed 2026-05-24 23:26 UTC · model grok-4.3


keywords: course rankings · professor ratings · student bias · GPA outcomes · online forums · academic ratings · course outcomes · institutional analysis


The pith

Student ratings on course forums show a bias toward courses with higher average GPAs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether student ratings of professors on online forums carry an implicit bias toward courses that produce better measured outcomes. It draws on ranking data for more than ten thousand courses at Virginia Tech and twenty-five peer institutions, comparing student-reported GPAs against overall instructor scores and rating patterns from two popular academic forums. The analysis finds a discernible but complex link between higher GPAs and better professor ratings. A reader would care because these forums shape course selection and institutional views of teaching quality, so any outcome-linked tilt could distort both student decisions and evaluation systems.

Core claim

Ranking data from the forums reveal a bias toward course outcomes in the professor ratings registered by students, with experiments showing that higher student-reported GPAs correspond to higher overall instructor rankings in a discernible though complex manner.

What carries the argument

Comparison of student-reported GPA as a measure of course outcomes against overall professor rankings and rating disparity across the two forums.
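
The paper passage quoted further down this page mentions a one-way ANOVA (F-test) in connection with GPA and instructor rating. As a rough illustration of that kind of comparison, the sketch below computes a one-way ANOVA F statistic for instructor ratings grouped by GPA band. The band names, the toy ratings, and the function name `one_way_anova_f` are hypothetical and are not taken from the paper; turning F into a p-value would additionally need the F distribution, which is left out here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of non-empty samples."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    k = len(groups)
    n = sum(len(g) for g in groups)
    if n <= k:
        raise ValueError("need more observations than groups")
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical instructor ratings for courses in three GPA bands.
low_gpa = [2.0, 3.0]
mid_gpa = [3.0, 4.0]
high_gpa = [4.0, 5.0]
f_stat, df_b, df_w = one_way_anova_f([low_gpa, mid_gpa, high_gpa])
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom would suggest that mean ratings differ across GPA bands; it would not by itself say why they differ.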

If this is right

Ratings may reflect grade outcomes at least as much as instructional quality.
Students could be steered toward courses with higher average grades rather than those offering stronger learning.
Universities relying on forum data for teaching assessment receive signals partly shaped by grading patterns.
The complexity of the bias implies it may vary by course type or institutional context.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Aggregated forum scores may be less useful for comparing teaching effectiveness across courses of different typical difficulties.
Similar outcome-linked biases could exist in other rating platforms where success metrics are visible to raters.
Adjusting ratings for average GPA might produce a cleaner signal of perceived teaching quality for further study.
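
One simple way to read "adjusting ratings for average GPA" is to fit a straight line of rating on GPA and keep the residuals. The sketch below does that with ordinary least squares on one predictor. It is an editorial illustration, not a method from the paper: the function name `gpa_adjusted_ratings` and the toy values are hypothetical, and a linear fit is itself an assumption given that the paper describes the relationship as complex.

```python
from statistics import mean

def gpa_adjusted_ratings(gpas, ratings):
    """Return residuals of ratings after a least-squares line fit on GPA."""
    if len(gpas) != len(ratings) or len(gpas) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    g_bar, r_bar = mean(gpas), mean(ratings)
    sxx = sum((g - g_bar) ** 2 for g in gpas)
    if sxx == 0:
        raise ValueError("GPA values are constant; slope is undefined")
    sxy = sum((g - g_bar) * (r - r_bar) for g, r in zip(gpas, ratings))
    slope = sxy / sxx
    intercept = r_bar - slope * g_bar
    # A positive residual means the course is rated higher than its
    # average GPA alone would predict under the fitted line.
    return [r - (intercept + slope * g) for g, r in zip(gpas, ratings)]

# Hypothetical per-course averages (not data from the paper).
gpas = [2.0, 3.0, 4.0]
ratings = [2.5, 3.5, 4.0]
adjusted = gpa_adjusted_ratings(gpas, ratings)
print(adjusted)
```

The residuals sum to zero by construction, so they rank courses relative to one another rather than on the original rating scale.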

Load-bearing premise

Student-reported GPA differences across courses serve as a direct proxy for course outcomes that can expose rating bias without large confounding from difficulty, self-selection, or forum-specific habits.

What would settle it

Absence of a significant correlation between average GPA and average professor ratings after accounting for course level and department would undermine the reported bias.
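
A minimal sketch of the check described above, assuming each course record carries a grouping label such as department: subtract each group's mean GPA and mean rating, then correlate the deviations. The helper names and the toy rows are hypothetical, and the sketch reports only the correlation coefficient, not a significance test. The toy data is deliberately built so that the pooled correlation is positive while the within-department correlation is negative, which is the pattern that would undermine a pooled finding.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length samples."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample is constant; correlation is undefined")
    return sxy / sqrt(sxx * syy)

def within_group_correlation(rows):
    """rows: iterable of (group, gpa, rating); correlate group-demeaned values."""
    by_group = defaultdict(list)
    for group, gpa, rating in rows:
        by_group[group].append((gpa, rating))
    gpa_dev, rating_dev = [], []
    for pairs in by_group.values():
        g_bar = mean(g for g, _ in pairs)
        r_bar = mean(r for _, r in pairs)
        for g, r in pairs:
            gpa_dev.append(g - g_bar)
            rating_dev.append(r - r_bar)
    return pearson(gpa_dev, rating_dev)

# Hypothetical (department, average GPA, average rating) rows.
rows = [
    ("A", 2.0, 3.0), ("A", 2.5, 2.5),
    ("B", 3.5, 4.5), ("B", 4.0, 4.0),
]
pooled = pearson([g for _, g, _ in rows], [r for _, _, r in rows])
within = within_group_correlation(rows)
print(pooled, within)
```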

Figures

Figures reproduced from arXiv: 1907.05846 by Bob Edmison, Daron Williams, Larry Cox II, Matthew Louvet, Taha Hassan.

Original abstract

University students routinely use the tools provided by online course ranking forums to share and discuss their satisfaction with the quality of instruction and content in a wide variety of courses. Student perception of the efficacy of pedagogies employed in a course is a reflection of a multitude of decisions by professors, instructional designers and university administrators. This complexity has motivated a large body of research on the utility, reliability, and behavioral correlates of course rankings. There is, however, little investigation of the (potential) implicit student bias on these forums towards desirable course outcomes at the institution level. To that end, we examine the connection between course outcomes (student-reported GPA) and the overall ranking of the primary course instructor, as well as rating disparity by nature of course outcomes, based on data from two popular academic rating forums. Our experiments with ranking data about over ten thousand courses taught at Virginia Tech and its 25 SCHEV-approved peer institutions indicate that there is a discernible albeit complex bias towards course outcomes in the professor ratings registered by students.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reports an association between higher student GPAs and better instructor ratings on course forums from a large institutional dataset, but the evidence does not isolate bias from confounders like difficulty or selection.


The core finding is an observed link between student-reported GPAs and professor ratings across more than ten thousand courses at Virginia Tech and peer schools, described as a complex bias. The work draws on data from two rating forums and frames this as filling a gap in institutional-level analysis of rating patterns. That scale of data collection is the main concrete contribution here, and it is useful for anyone tracking how these forums actually behave in practice. The authors stick to observational correlations rather than claiming a new model or causal mechanism, which keeps the claims grounded in what the data can show directly. The dataset itself looks like a reasonable new resource for this narrow question.

The main limitation is the treatment of GPA differences as a direct proxy for course outcomes that can reveal bias. The abstract gives no indication of controls for course difficulty, enrollment patterns, department norms, or instructor-specific effects, so the association could reflect students rating easier classes more highly for straightforward reasons rather than any implicit bias mechanism. Without those adjustments or robustness checks, the bias label stays interpretive.

This paper is aimed at researchers who study course evaluation systems or online academic forums and want empirical patterns from real institutional data. Readers focused on causal questions or policy recommendations will find it preliminary. It is worth sending to peer review because the data volume and the question are substantive enough to merit referee input on the methods, even though the current analysis needs tightening on the controls.

Referee Report

1 major / 0 minor

Summary. The manuscript examines potential implicit bias in student ratings on online academic forums by analyzing the relationship between student-reported GPA (as a proxy for course outcomes) and instructor ratings. Using observational data from two popular forums covering over 10,000 courses at Virginia Tech and its 25 SCHEV-approved peer institutions, the paper concludes there is a discernible albeit complex bias towards course outcomes in the registered professor ratings.

Significance. If the central claim were substantiated through controls for confounders, the work would contribute to research on the reliability of student evaluations of teaching by identifying how course outcomes may influence forum ratings. The scale of the dataset across multiple institutions is a notable strength for observational analysis in this domain.

major comments (1)

[Abstract] Abstract: The claim that experiments indicate a 'discernible albeit complex bias' towards course outcomes provides no details on statistical controls, error bars, data cleaning, or handling of confounders such as course difficulty, enrollment selectivity, department norms, or instructor fixed effects. This omission is load-bearing because the central claim requires that GPA differences isolate bias rather than reflect legitimate factors (e.g., easier courses attracting higher ratings or self-selection), and without such conditioning the observed association cannot be attributed to bias.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. The major comment concerns the abstract's lack of methodological detail. We address this point below and will revise the manuscript accordingly.


Referee: [Abstract] Abstract: The claim that experiments indicate a 'discernible albeit complex bias' towards course outcomes provides no details on statistical controls, error bars, data cleaning, or handling of confounders such as course difficulty, enrollment selectivity, department norms, or instructor fixed effects. This omission is load-bearing because the central claim requires that GPA differences isolate bias rather than reflect legitimate factors (e.g., easier courses attracting higher ratings or self-selection), and without such conditioning the observed association cannot be attributed to bias.

Authors: We agree the abstract is concise and omits key details. The full manuscript describes data from two forums (>10k courses), data cleaning to exclude incomplete or outlier entries, and regression models with department and institution fixed effects; standard errors and confidence intervals appear in results. We did not include instructor fixed effects or direct controls for course difficulty/enrollment selectivity (beyond GPA as proxy) due to data constraints. We will revise the abstract to summarize these controls, report that the association persists after institution/department conditioning, and clarify the observational/correlational nature of the findings rather than implying isolated causal bias. This makes the central claim more precise without overstatement.

Revision: yes

Circularity Check

0 steps flagged

No circularity: observational data analysis only


The paper reports an empirical observational study correlating student-reported GPA with instructor ratings from online forums across Virginia Tech and peer institutions. No derivation chain, equations, fitted parameters presented as predictions, or self-citations that bear the load of the central claim exist in the provided text. The analysis relies on external data sources and standard statistical associations without any self-referential construction or reduction of results to inputs by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Empirical observational study; central claim rests on the validity of forum data as representative of student perceptions and on GPA as a suitable proxy for course outcomes. No new entities are introduced.

axioms (2)

Domain assumption: Student-reported GPA serves as a reliable proxy for course outcomes that can be compared directly to instructor ratings.
Invoked when linking GPA to ratings to detect bias (abstract).
Domain assumption: Ratings on the two forums reflect genuine student perceptions of instruction rather than forum-specific artifacts.
Required to interpret observed correlations as evidence of bias.

pith-pipeline@v0.9.0 · 5711 in / 1267 out tokens · 21550 ms · 2026-05-24T23:26:31.629237+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear

Paper passage: "correlation between the average student-reported GPA and overall instructor rating... one-way ANOVA (F-test, table II)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

[1] (2019) Rate my professors - review teachers and professors, school reviews, college campus ratings. [Online]. Available: http://ratemyprofessors.com/

[2] G. LoPresti, P. Gartlan, D. Donahoe, and M. Le. (2019) Koofers - professor ratings, practice exams and flash cards. [Online]. Available: http://koofers.com/

[3] J. Kindred and S. N. Mohammed, “‘he will crush you like an academic ninja!’: Exploring teacher ratings on ratemyprofessors.com,” Journal of Computer-Mediated Communication, vol. 10, no. 3, p. JCMC10314, 2005

[4] Y. Chang and S. W. Park, “Exploring students’ perspectives of college stem: An analysis of course rating websites,” International Journal of Teaching and Learning in Higher Education, vol. 26, no. 1, pp. 90–101, 2014

[5] M. J. Brown, M. Baillie, and S. Fraser, “Rating ratemyprofessors.com: A comparison of online and official student evaluations of teaching,” College Teaching, vol. 57, no. 2, pp. 89–92, 2009

[6] A. M. Legg and J. H. Wilson, “Ratemyprofessors.com offers biased evaluations,” Assessment & Evaluation in Higher Education, vol. 37, no. 1, pp. 89–97, 2012

[7] K. B. Hartman and J. B. Hunt, “What ratemyprofessors.com reveals about how and why students evaluate their professors: A glimpse into the student mind-set,” Marketing Education Review, vol. 23, no. 2, pp. 151–162, 2013

[8] H. W. Marsh, “Students’ evaluations of university teaching: Dimensionality, reliability, validity, potential baises, and utility,” Journal of Educational Psychology, vol. 76, no. 5, p. 707, 1984

[9] J. A. Centra, “Will teachers receive higher student evaluations by giving higher grades and less course work?” Research in Higher Education, vol. 44, no. 5, pp. 495–518, 2003

[10] K. A. Feldman, “Identifying exemplary teachers and teaching: Evidence from student ratings,” in The scholarship of teaching and learning in higher education: An evidence-based perspective. Springer, 2007, pp. 93–143

[11] T. Hassan and D. S. McCrickard, “Trust and trustworthiness in social recommender systems,” Companion Proceedings of the 2019 World Wide Web Conference (WWW '19 Companion), May 2019

[12] S. A. Stumpf and R. D. Freedman, “Expected grade covariation with student ratings of instruction: Individual versus class effects,” Journal of Educational Psychology, vol. 71, no. 3, p. 293, 1979

[13] K. A. Feldman, “The association between student ratings of specific instructional dimensions and student achievement: Refining and extending the synthesis of data from multisection validity studies,” Research in Higher Education, vol. 30, no. 6, pp. 583–645, 1989

[14] T. Hassan, “On bias in social reviews of university courses,” in Companion Publication of the 10th ACM Conference on Web Science. ACM, 2019, pp. 11–14

[15] (2019) Peer Institutions and Comparisons - Virginia Tech. [Online]. Available: https://www.ir.vt.edu/data/peers.html

[16] L. S. Shulman, “The carnegie classification of institutions of higher education,” Menlo Park: Carnegie Publication, 2001

[17] B. Schölkopf, A. Smola, and K.-R. Müller, “Kernel principal component analysis,” in International conference on artificial neural networks. Springer, 1997, pp. 583–588

[18] M. Meila, “An accelerated chow and liu algorithm: fitting tree distributions to high dimensional sparse data,” 1999
