pith. machine review for the scientific record.

arxiv: 2604.12821 · v1 · submitted 2026-04-14 · 💻 cs.CY

Recognition: unknown

Detecting and Enhancing Intellectual Humility in Online Political Discourse

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 14:16 UTC · model grok-4.3

classification 💻 cs.CY
keywords intellectual humility · online discourse · Reddit · political discussion · machine learning classifier · randomized trial · nudges · polarization

The pith

Intellectual humility can be measured at scale and increased through interventions in online political discussions without reducing engagement.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to define and measure intellectual humility, the recognition of one's own intellectual limits, in Reddit political threads where it is often scarce. Researchers created a codebook to label dimensions of humility and arrogance in hundreds of posts, trained a classifier on those labels, and applied both to study real discussions. Observational results show that threads with higher humility tend to produce more of the same in later posts, while a randomized trial tested prompts that raised humility expressions across multiple contentious topics. This line of work matters because if humility can be detected and reliably prompted, it offers a route to less polarized online exchanges that still keep users participating.

Core claim

Using a codebook to annotate several hundred Reddit posts for intellectual humility and arrogance, the authors trained and validated a classifier that enables large-scale measurement. Observational analysis of political discussions revealed that environments with more or less humility tend to sustain similar levels in subsequent posts. A randomized controlled trial then showed that simple nudges can increase expressions of humility across different topics and conversational settings.

What carries the argument

A codebook that breaks intellectual humility and intellectual arrogance into annotatable dimensions, paired with a machine-learning classifier trained on labeled Reddit posts to detect these traits at scale.
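As a concrete sketch of what this pipeline involves: the paper does not specify its model or features, so the classifier, training examples, and labels below are all hypothetical, but a minimal bag-of-words Naive Bayes baseline shows the shape of the labeling-to-detection step.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (post text, label) pairs. Labels follow
# the paper's IH (intellectual humility) / IA (intellectual arrogance) split.
TRAIN = [
    ("i might be wrong about this but here is my view", "IH"),
    ("fair point, i had not considered that angle", "IH"),
    ("anyone who disagrees is simply an idiot", "IA"),
    ("i am obviously right and you know nothing", "IA"),
]

def train_nb(examples):
    """Fit a multinomial Naive Bayes model with add-one smoothing."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest log-posterior for `text`."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.split():
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_nb(TRAIN)
print(classify("i could be wrong here", model))  # → IH (on this toy data)
```

A real deployment would swap in the authors' annotated posts and a stronger model; the load-bearing step is the same either way — human labels in, post-level IH/IA predictions out.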

Load-bearing premise

The codebook and resulting classifier capture the actual psychological trait of intellectual humility rather than merely surface patterns in the annotated sample.

What would settle it

A new experiment in which participants given the humility prompts show no measurable rise in willingness to consider opposing views or revise their positions when tested in follow-up discussions.

Figures

Figures reproduced from arXiv: 2604.12821 by Ivory Yang, Jasmine Mangat, Martin Saveski, Melody Yu, Nabeel Gillani, Rachel Chen, Samantha D'Alonzo, Soroush Vosoughi, Weicheng Ma, Weidong Zhang.

Figure 1: The coefficients for the “Base Model” shown in …
Figure 3: Results from the logistic regression attempting to …
Figure 4: Non-Zero Coefficients from the L1 Logistic Regression for PerspectiveAPI feature selection. Error bars represent …
Figure 5: The average results from 20 trials with our GPT …
Figure 6: This diagram represents the lab-based experiment design flow. Randomization happens when participants enroll in …
Figure 7: Questions used to gauge participants’ topic/stance …
Figure 9: An example of the comment functionality and “So …
Figure 10: The coefficients for the “Base” model shown in …
Figure 11: Interactions between the effects of the treatments …
Original abstract

Intellectual humility (IH), a recognition of one's own intellectual limitations, can reduce polarization and foster more understanding across lines of difference. Yet little work explores how IH can be systematically defined, measured, evaluated, and enhanced in spaces that often lack it the most: online political discussions. In this paper, we seek to bridge these gaps by exploring two questions: 1) how might preexisting levels of IH influence future expressions of IH during online political discourse? and 2) can online interventions enhance IH across different political topics and conversational environments? To pursue these questions, we define a codebook characterizing different dimensions of IH and intellectual arrogance (IA) and have researchers use it to annotate several hundred Reddit posts, which we then use to develop and validate a classifier to support IH analysis at scale. These tools subsequently enable two key contributions: i) an observational data analysis of how IH varies across different political discussions on Reddit, which reveals that more/less IH environments tend to contain future posts of a similar nature, and ii) a randomized control trial evaluating strategies for nudging discussion participants to demonstrate more IH in their posts, which reveals the possibility of enhancing IH in online discussions across a range of contentious topics. Our findings highlight the possibility of measuring and increasing IH online without necessarily reducing engagement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper defines a codebook for dimensions of intellectual humility (IH) and intellectual arrogance (IA), has researchers annotate several hundred Reddit posts from political discussions, trains and validates a classifier on these annotations, performs an observational analysis showing that IH levels persist across posts in the same discussion environments, and conducts a randomized controlled trial testing interventions to increase IH expression without reducing engagement. The central claims are that preexisting IH influences future expressions and that online nudges can enhance IH across topics.

Significance. If the classifier and codebook validly operationalize the psychological construct of IH (rather than surface linguistic features), the work offers a scalable measurement tool and evidence that IH can be increased in contentious online spaces without engagement costs. The combination of observational persistence findings and an RCT intervention is a methodological strength for computational social science on polarization. However, the absence of reported validation metrics means the practical significance cannot yet be assessed.

major comments (3)
  1. [Annotation and classifier validation] The manuscript states that researchers annotated several hundred Reddit posts to train and validate a classifier, but it reports no inter-annotator agreement statistics (e.g., Cohen's kappa or Krippendorff's alpha) and no classifier performance metrics (precision, recall, F1, or confusion matrices). All subsequent observational and RCT results are interpreted through this classifier; without these numbers it is impossible to determine whether the tool captures the intended IH construct or merely annotator-specific lexical patterns in the Reddit sample.
  2. [RCT evaluation] The trial claims the interventions enhance IH across topics and environments, yet the abstract and visible description provide no details on per-condition sample sizes, effect sizes, handling of post-hoc exclusions, or whether the classifier was applied blindly to intervention vs. control posts. These omissions are load-bearing because the enhancement claim rests entirely on classifier-assigned IH scores.
  3. [Observational data analysis] The persistence result (more/less IH environments contain future posts of a similar nature) is presented as evidence that IH levels influence future expressions, but without the exact time windows, discussion-thread definitions, or robustness checks against topic confounds, it is unclear whether the pattern reflects genuine carry-over of IH or stable topic and participant effects.
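The statistics the first major comment asks for are standard and cheap to report. A minimal sketch of both, in pure Python with invented annotation sequences (real values would come from the paper's double-annotated subset):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two annotators' label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = Counter(a), Counter(b)
    expected = sum(pa[lab] * pb[lab] for lab in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

def f1(gold, pred, positive):
    """Precision/recall/F1 for one class, e.g. the IH label."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Hypothetical annotations on ten posts (IH / IA / neither "N").
ann1 = ["IH", "IH", "IA", "N", "IH", "IA", "N", "IH", "IA", "N"]
ann2 = ["IH", "IA", "IA", "N", "IH", "IA", "N", "IH", "N", "N"]
print(round(cohens_kappa(ann1, ann2), 3))  # → 0.701
print(round(f1(ann1, ann2, "IH"), 3))      # → 0.857
```

The same `f1` computation applies to the classifier itself once `gold` is the held-out human annotation and `pred` is the model output.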
minor comments (2)
  1. [Abstract] The abstract would benefit from stating the exact number of annotated posts, the classifier's reported performance, and the RCT sample size to allow readers to gauge scale immediately.
  2. [Codebook definition] Notation for IH vs. IA dimensions in the codebook could be clarified with an explicit table mapping each dimension to example annotations.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

Thank you for the detailed and constructive feedback on our manuscript. We appreciate the opportunity to clarify and strengthen our presentation of the methods and results. Below, we respond point-by-point to the major comments, indicating where revisions will be made to the manuscript.

Point-by-point responses
  1. Referee: [Annotation and classifier validation] The manuscript states that researchers annotated several hundred Reddit posts to train and validate a classifier, but it reports no inter-annotator agreement statistics (e.g., Cohen's kappa or Krippendorff's alpha) and no classifier performance metrics (precision, recall, F1, or confusion matrices). All subsequent observational and RCT results are interpreted through this classifier; without these numbers it is impossible to determine whether the tool captures the intended IH construct or merely annotator-specific lexical patterns in the Reddit sample.

    Authors: We acknowledge that the current version of the manuscript does not include the inter-annotator agreement statistics or detailed classifier performance metrics. This was an oversight in our reporting. In the revised manuscript, we will add a dedicated subsection or expand the existing section to report Cohen's kappa (or Krippendorff's alpha) for the annotations, as well as precision, recall, F1 scores, and confusion matrices for the classifier. These metrics will help demonstrate that the classifier reliably captures the intended dimensions of intellectual humility rather than superficial patterns. revision: yes

  2. Referee: [RCT evaluation] The trial claims the interventions enhance IH across topics and environments, yet the abstract and visible description provide no details on per-condition sample sizes, effect sizes, handling of post-hoc exclusions, or whether the classifier was applied blindly to intervention vs. control posts. These omissions are load-bearing because the enhancement claim rests entirely on classifier-assigned IH scores.

    Authors: We agree that providing these details is crucial for evaluating the RCT results. In the revision, we will include the per-condition sample sizes, report effect sizes (e.g., Cohen's d or appropriate measures), describe how post-hoc exclusions were handled (if any), and explicitly state that the classifier was applied in a blinded manner to the posts from different conditions. We will also update the abstract to reference these key methodological details where space permits. revision: yes

  3. Referee: [Observational data analysis] The persistence result (more/less IH environments contain future posts of a similar nature) is presented as evidence that IH levels influence future expressions, but without the exact time windows, discussion-thread definitions, or robustness checks against topic confounds, it is unclear whether the pattern reflects genuine carry-over of IH or stable topic and participant effects.

    Authors: We will clarify these aspects in the revised manuscript. Specifically, we will specify the exact time windows used for the observational analysis, provide precise definitions of discussion threads, and include additional robustness checks (such as controlling for topic via fixed effects or topic modeling) to address potential confounds. This will strengthen the interpretation that the observed persistence reflects carry-over effects of IH levels. revision: yes
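For the effect sizes promised in response 2, Cohen's d is simply the difference in group means scaled by the pooled standard deviation. A sketch with invented classifier-assigned IH scores standing in for the actual trial data:

```python
import math

def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    m1 = sum(treatment) / n1
    m2 = sum(control) / n2
    v1 = sum((x - m1) ** 2 for x in treatment) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in control) / (n2 - 1)
    pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled

# Hypothetical per-post IH scores assigned by the classifier.
nudged  = [0.62, 0.55, 0.71, 0.48, 0.66]
control = [0.41, 0.52, 0.38, 0.45, 0.50]
print(round(cohens_d(nudged, control), 2))  # → 1.99
```

Reporting d alongside the raw means would let readers judge whether the nudge effect is practically meaningful, not just statistically detectable.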

Circularity Check

0 steps flagged

No circularity: empirical pipeline relies on external annotations and RCT design

Full rationale

The paper's core contributions are an annotation codebook drawn from prior psychological literature, human labeling of Reddit posts, training of a downstream classifier on those labels, an observational analysis of persistence in IH levels, and an RCT testing interventions. None of these steps reduces a claimed prediction or result to its own fitted outputs by construction. The classifier is validated against held-out human annotations rather than self-generated labels, and the RCT compares treatment arms against a control using the classifier as a measurement tool. No self-citation chains, uniqueness theorems, or ansatzes are invoked to force the central claims. The work is anchored to external benchmarks (human annotations and randomized assignment) and receives the default non-circularity finding.
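The held-out validation that this audit leans on is leak-proof only if a post's annotation can never appear on both sides of the split. One way to guarantee that is a deterministic hash-based split; the post IDs below are hypothetical:

```python
import hashlib

def holdout_split(post_ids, held_out_fraction=0.2):
    """Deterministically assign each post to train or held-out.

    The same ID always lands in the same set, so held-out annotations
    cannot leak into training across reruns.
    """
    train, held_out = [], []
    for pid in post_ids:
        digest = hashlib.sha256(pid.encode()).digest()
        bucket = digest[0] / 255  # stable pseudo-random value in [0, 1]
        (held_out if bucket < held_out_fraction else train).append(pid)
    return train, held_out

posts = [f"t3_post{i}" for i in range(100)]
train, held = holdout_split(posts)
assert not set(train) & set(held)  # disjoint by construction
```

The fraction is approximate (it depends on the hash values), but the disjointness is exact, which is the property the non-circularity finding needs.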

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the assumption that human annotations using the new codebook constitute a valid ground truth for intellectual humility and that the RCT interventions causally affect expressed humility rather than merely changing surface language. No free parameters or invented entities are described in the abstract.

axioms (2)
  • domain assumption The codebook dimensions faithfully represent the psychological construct of intellectual humility.
    Invoked when the authors use the codebook to annotate posts and train the classifier.
  • domain assumption The randomized prompts constitute a clean intervention that does not alter engagement or topic selection in ways that confound humility measurement.
    Required for interpreting the RCT results as evidence of enhancement.

pith-pipeline@v0.9.0 · 5556 in / 1390 out tokens · 22269 ms · 2026-05-10T14:16:10.141402+00:00 · methodology

