arxiv: 2508.08422 · v1 · pith:DDPU7TDEnew · submitted 2025-08-11 · ⚛️ physics.ed-ph

Multi-institutional assessment of Peer Instruction implementation and impacts using the Framework for Interactive Learning in Lectures

Pith reviewed 2026-05-21 22:44 UTC · model grok-4.3

classification ⚛️ physics.ed-ph
keywords Peer Instruction · active learning · physics education · classroom observation · student learning gains · multi-institutional study · interactive strategies

The pith

Instructors who blend interactive and vicarious interactive strategies in Peer Instruction achieve larger student learning gains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The study examines how seven introductory physics instructors at six institutions implement Peer Instruction. Classroom videos and conceptual inventory data are analyzed with the FILL+ framework to sort activities into interactive ones such as clicker questions, vicarious interactive ones such as students asking questions, and non-interactive lecturing. Results indicate that instructors who combine both interactive types produce higher student learning gains than those who rely mainly on one type. This finding matters because it identifies a concrete adjustment instructors can make to active learning without abandoning the method.

Core claim

Using video data and conceptual inventory scores from multiple institutions, the analysis indicates that instructors employing both interactive strategies, such as clicker questions with peer discussion, and vicarious interactive strategies, such as individual students posing questions, produce larger student learning gains than those who primarily use only interactive or only vicarious interactive approaches.

What carries the argument

The Framework for Interactive Learning in Lectures (FILL+), which categorizes classroom activities into interactive, vicarious interactive, and non-interactive types to evaluate variations in Peer Instruction implementation.
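
One way to make the three-way classification concrete is to tally how much coded class time falls in each category. The sketch below is illustrative only: the `(start, end, category)` segment format, the function name `time_fractions`, and the sample lecture are assumptions made for this example, not the FILL+ tool's actual coding procedure or data from the study.

```python
from collections import defaultdict

# Category names as given in the paper's abstract.
CATEGORIES = ("interactive", "vicarious interactive", "non-interactive")


def time_fractions(segments):
    """Return the fraction of coded class time spent in each category.

    `segments` is a list of (start_s, end_s, category) tuples. This input
    format is hypothetical and is not the FILL+ tool's own format.
    """
    totals = defaultdict(float)
    for start, end, category in segments:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        if end < start:
            raise ValueError("segment ends before it starts")
        totals[category] += end - start
    coded = sum(totals.values())
    if coded == 0:
        raise ValueError("no coded time")
    # Categories that never occur get a fraction of 0.0.
    return {c: totals[c] / coded for c in CATEGORIES}


# Made-up 50-minute lecture (times in seconds), for illustration only.
example_segments = [
    (0, 600, "non-interactive"),
    (600, 900, "interactive"),
    (900, 1200, "vicarious interactive"),
    (1200, 2400, "non-interactive"),
    (2400, 3000, "interactive"),
]
fractions = time_fractions(example_segments)
```

With per-instructor fractions like these, an analyst could label an instructor as using mainly one interactive type or a blend of both. Any threshold for that label would be a further assumption.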

If this is right

  • Student learning gains increase when Peer Instruction includes both direct interaction through clicker questions and vicarious interaction through student questions.
  • Implementation details of active learning, not just its presence, shape student outcomes.
  • The pattern holds across multiple institutions, supporting broader applicability.
  • Conceptual inventory scores can distinguish effects of specific strategy combinations.
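
Figure 1 of the paper reports effect sizes for concept inventory scores, but this page does not state which statistic was used. As a hedged illustration, the sketch below assumes Cohen's d with a pooled standard deviation, a common choice for pre/post comparisons. The function name `cohens_d` and the scores are made up for the example.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(pre, post):
    """Cohen's d for post vs. pre scores, using the pooled standard deviation.

    Assumption: the two samples are treated as independent groups. The
    paper's actual effect-size computation may differ.
    """
    n_pre, n_post = len(pre), len(post)
    if n_pre < 2 or n_post < 2:
        raise ValueError("need at least two scores in each sample")
    s_pre, s_post = stdev(pre), stdev(post)
    pooled = sqrt(
        ((n_pre - 1) * s_pre**2 + (n_post - 1) * s_post**2)
        / (n_pre + n_post - 2)
    )
    if pooled == 0:
        raise ValueError("scores have zero variance")
    return (mean(post) - mean(pre)) / pooled


# Hypothetical concept inventory scores, not data from the study.
pre_scores = [10, 12, 14, 16, 18]
post_scores = [14, 16, 18, 20, 22]
d = cohens_d(pre_scores, post_scores)
```

Computing one such value per instructor, with its sample size, would give a reader the quantities needed to compare strategy groups.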

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Instructor training could focus on deliberately mixing the two interactive categories during class time.
  • The same classification approach might reveal useful patterns in other active learning methods or disciplines.
  • Measuring the exact balance or timing between the two strategies could identify an optimal mix for gains.

Load-bearing premise

That differences in student learning gains can be attributed primarily to the observed combination of interactive and vicarious interactive strategies rather than to unmeasured factors such as instructor experience, student population differences, or other teaching elements.

What would settle it

A follow-up study that assigns instructors to use only one strategy versus a deliberate mix and measures gains on identical conceptual inventories before and after the course.

Figures

Figures reproduced from arXiv: 2508.08422 by Adrienne L. Traxler, Eric Brewe, Ibukunoluwa Bukola, Justin Gambrell, Meagan Sundstrom, Olive Ross.

Figure 1: (a) Effect sizes for concept inventory scores. Points ...

Original abstract

Substantial research indicates that active learning methods improve student learning more than traditional lecturing. Accordingly, current studies aim to characterize and evaluate different instructors' implementations of active learning methods. Peer Instruction is one of the most commonly used active learning methods in undergraduate physics instruction and typically involves the use of classroom response systems (e.g., clickers) where instructors pose conceptual questions that students answer individually and/or in collaboration with nearby peers. Several research studies have identified that different instructors vary in the ways they implement Peer Instruction (e.g., the time they give students to answer a question and the time they spend explaining the correct answer); however, these studies only take place at a single institution and do not relate the implementation of Peer Instruction to student learning. In this study, we analyze variation in both the implementation and impacts of Peer Instruction. We use classroom video observations and conceptual inventory data from seven introductory physics instructors across six U.S. institutions. We characterize implementation using the Framework for Interactive Learning in Lectures (FILL+), which classifies classroom activities as interactive (e.g., clicker questions), vicarious interactive (e.g., individual students asking a question), or non-interactive (e.g., instructor lecturing). Our preliminary results suggest that instructors who use both interactive and vicarious interactive strategies may exhibit larger student learning gains than instructors who predominantly use only one of the two strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript reports a multi-institutional observational study of Peer Instruction implementation across seven introductory physics instructors at six U.S. institutions. Classroom videos are coded using the Framework for Interactive Learning in Lectures (FILL+) to classify activities as interactive (e.g., clicker questions), vicarious interactive (e.g., student questions), or non-interactive. Conceptual inventory data are used to examine impacts, with the central preliminary claim that instructors combining both interactive and vicarious interactive strategies show larger student learning gains than those using predominantly one strategy.

Significance. If the central claim can be substantiated with appropriate controls and statistical detail, the work would contribute to explaining variation in active-learning outcomes and could guide more effective Peer Instruction implementations. The multi-institutional scope and use of a standardized coding framework (FILL+) are strengths that support generalizability and reproducibility.

major comments (2)
  1. Abstract: the preliminary results claim larger learning gains for the combined-strategy group but supply no sample sizes per instructor, no statistical controls for instructor experience or student demographics, no error bars, and no exclusion criteria. This directly undermines the attribution of gains to the interactive/vicarious-interactive mix rather than confounding factors.
  2. Abstract and implied results section: with only seven instructors total, any observed differences in conceptual-inventory gains are equally consistent with selection effects or unmeasured variables; the manuscript must demonstrate that these alternatives have been measured or ruled out for the central claim to hold.
minor comments (1)
  1. Abstract: the description of FILL+ categories would be clearer if a brief example of each (interactive, vicarious interactive, non-interactive) were included for readers unfamiliar with the framework.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed feedback. We address the major comments point by point below, outlining revisions that will improve clarity and transparency while preserving the preliminary and observational character of the study.

Point-by-point responses
  1. Referee: Abstract: the preliminary results claim larger learning gains for the combined-strategy group but supply no sample sizes per instructor, no statistical controls for instructor experience or student demographics, no error bars, and no exclusion criteria. This directly undermines the attribution of gains to the interactive/vicarious-interactive mix rather than confounding factors.

    Authors: We agree these details are necessary for proper evaluation. In the revised manuscript we will expand the abstract to report overall and per-instructor student sample sizes, reference the error bars already present in the figures, and state the exclusion criteria used for conceptual-inventory data. We will also add an explicit statement that no statistical controls for instructor experience or student demographics were applied, owing to the observational design and small number of instructors, and will expand the limitations discussion accordingly. revision: yes

  2. Referee: Abstract and implied results section: with only seven instructors total, any observed differences in conceptual-inventory gains are equally consistent with selection effects or unmeasured variables; the manuscript must demonstrate that these alternatives have been measured or ruled out for the central claim to hold.

    Authors: We recognize that seven instructors preclude statistical ruling-out of selection effects or unmeasured variables. The revision will strengthen the explicit discussion of these limitations, present per-instructor data transparently, and qualify the central claim as preliminary and suggestive. We will not claim to have ruled out alternatives but will argue that the standardized FILL+ coding and multi-institutional scope still yield useful initial patterns for the field. revision: partial

standing simulated objections not resolved
  • Fully demonstrating that selection effects and unmeasured variables have been measured or ruled out, which is not feasible with the current observational data from only seven instructors.

Circularity Check

0 steps flagged

No circularity: purely empirical observational study with no derivations or self-referential logic.

full rationale

This is an observational study reporting preliminary patterns from classroom video coding (via FILL+) and conceptual inventory gains across N=7 instructors. The abstract and provided text contain no equations, no fitted parameters presented as predictions, no self-citation chains invoked as uniqueness theorems, and no ansatzes or renamings of results. The central suggestion about combined interactive/vicarious strategies rests on external data collection rather than any internal definition or construction that reduces the outcome to the inputs. Per the guidelines, an empirical paper self-contained against external benchmarks receives score 0 when no load-bearing step reduces by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the validity of the FILL+ framework for classifying activities and on the assumption that the seven-instructor sample captures meaningful variation in implementation impacts.

axioms (1)
  • Domain assumption: The Framework for Interactive Learning in Lectures (FILL+) provides a valid classification of classroom activities that relates to differences in student learning.
    Invoked to characterize implementation and link it to outcomes.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

    Relation between the paper passage and the cited Recognition theorem.

    We characterize implementation using the Framework for Interactive Learning in Lectures (FILL+), which classifies classroom activities as interactive (e.g., clicker questions), vicarious interactive (e.g., individual students asking a question), or non-interactive (e.g., instructor lecturing).

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    Our preliminary results suggest that instructors who use both interactive and vicarious interactive strategies may exhibit larger student learning gains than instructors who predominantly use only one of the two strategies.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
