arXiv: 2603.18053 · v2 · pith:YYPIJ2ZEnew · submitted 2026-03-17 · 💻 cs.SI · cs.CY · econ.GN · q-fin.EC · stat.ML

Auditing the Auditors: Does Community-based Moderation Get It Right?

Pith reviewed 2026-05-21 11:24 UTC · model grok-4.3

classification: 💻 cs.SI · cs.CY · econ.GN · q-fin.EC · stat.ML
keywords: community notes · crowd-sourced moderation · consensus-based auditing · strategic conformity · stability-weighted aggregation · latent factor model · social media content moderation

The pith

X's Community Notes auditing ties user eligibility to agreement with the final outcome, causing minority contributors to conform and reducing participation on controversial topics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that consensus-based auditing in Community Notes creates incentives for strategic conformity, where contributors with minority views shift their ratings toward the majority to maintain eligibility. This effect is strongest on divisive content where independent signals are most valuable. The authors propose replacing agreement-based weighting with a two-stage process that first adjusts for content and contributor differences and then gives more influence to those whose past evaluations have stable residuals relative to a latent-factor model. If this holds, platforms could aggregate crowd labels more accurately while preserving participation from disagreeing users. A reader would care because many online platforms now rely on similar crowd moderation, and the auditing rule shapes whose voices survive.

Core claim

In X's Community Notes after September 2022, consensus-based auditing that conditions participation on agreement with the eventual aggregate leads to minority contributors' evaluations drifting toward the majority and their participation share declining on controversial topics. A behavioral model formalizes contributors trading private beliefs against expected penalties for disagreement. The proposed two-stage auditing and aggregation algorithm first accounts for differences across content and contributors, then weights each contributor by the stability of their residuals relative to the latent-factor model; contributors with consistently informative evaluations receive greater influence.

What carries the argument

The two-stage stability-weighted aggregation algorithm that first adjusts for content and contributor differences then assigns influence according to how predictable a contributor's evaluations remain relative to the latent-factor model.
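
The two stages described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the first stage is assumed to be an intercept-plus-factor model fitted by plain gradient descent, "stability" is assumed to mean inverse residual variance, and every name here (`fit_latent_factor`, `stability_weights`, `aggregate`, `reg`, `eps`) is hypothetical.

```python
import numpy as np

def fit_latent_factor(R, mask, k=1, reg=0.1, lr=0.05, epochs=300, seed=0):
    """Stage 1 (assumed form): r_un ~ mu + b_r[u] + b_n[n] + F[u] . G[n]."""
    rng = np.random.default_rng(seed)
    n_raters, n_notes = R.shape
    mu = R[mask].mean()
    b_r, b_n = np.zeros(n_raters), np.zeros(n_notes)
    F = 0.1 * rng.standard_normal((n_raters, k))
    G = 0.1 * rng.standard_normal((n_notes, k))
    cnt_r = np.maximum(mask.sum(axis=1), 1)
    cnt_n = np.maximum(mask.sum(axis=0), 1)
    for _ in range(epochs):
        pred = mu + b_r[:, None] + b_n[None, :] + F @ G.T
        E = np.where(mask, R - pred, 0.0)  # errors on observed ratings only
        b_r += lr * (E.sum(axis=1) / cnt_r - reg * b_r)
        b_n += lr * (E.sum(axis=0) / cnt_n - reg * b_n)
        F_step = E @ G / cnt_r[:, None] - reg * F
        G_step = E.T @ F / cnt_n[:, None] - reg * G
        F, G = F + lr * F_step, G + lr * G_step
    return mu, b_r, b_n, F, G

def residuals(R, mask, mu, b_r, b_n, F, G):
    """Observed rating minus model prediction; zero where no rating exists."""
    pred = mu + b_r[:, None] + b_n[None, :] + F @ G.T
    return np.where(mask, R - pred, 0.0)

def stability_weights(resid, mask, eps=1e-3):
    """Stage 2 (assumed metric): inverse variance of each rater's residuals."""
    n = np.maximum(mask.sum(axis=1), 1)
    mean = resid.sum(axis=1) / n
    var = np.where(mask, (resid - mean[:, None]) ** 2, 0.0).sum(axis=1) / n
    return 1.0 / (var + eps)

def aggregate(R, mask, b_r, F, G, w):
    """Weighted mean per note of ratings with rater-specific terms removed."""
    adjusted = np.where(mask, R - b_r[:, None] - F @ G.T, 0.0)
    W = np.where(mask, w[:, None], 0.0)
    return (W * adjusted).sum(axis=0) / np.maximum(W.sum(axis=0), 1e-12)

# Synthetic demo: 15 low-noise raters and 15 high-noise raters, 40 notes.
rng = np.random.default_rng(0)
quality = rng.standard_normal(40)
noise_sd = np.r_[np.full(15, 0.1), np.full(15, 1.0)]
R = quality[None, :] + noise_sd[:, None] * rng.standard_normal((30, 40))
mask = rng.random(R.shape) < 0.8
mu, b_r, b_n, F, G = fit_latent_factor(R, mask)
w = stability_weights(residuals(R, mask, mu, b_r, b_n, F, G), mask)
scores = aggregate(R, mask, b_r, F, G, w)
print("mean weight, low-noise vs high-noise raters:", w[:15].mean(), w[15:].mean())
```

Note that nothing in `stability_weights` looks at whether a rater agreed with the aggregate; a rater who deviates from the majority in a consistent, model-predictable way keeps a small residual variance and therefore a large weight. Whether that matches the paper's exact stability metric is an open assumption of this sketch.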

If this is right

  • Minority contributors maintain higher participation shares on topics with high disagreement.
  • Aggregate predictions of misleading content improve on data not used to fit the model.
  • Evaluations that consistently deviate from the majority but remain stable over time gain influence.
  • The system avoids reducing the weight of independent signals on controversial items.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar stability-based weighting could be tested on other platforms that currently reward agreement with final labels.
  • The method may preserve viewpoint diversity even when the true label is uncertain.
  • Live A/B deployment on Community Notes could measure whether minority participation rebounds after switching auditing rules.

Load-bearing premise

The stability of a contributor's past residuals relative to a latent-factor model serves as a reliable proxy for how informative their future evaluations will be, independent of whether they match the final consensus.

What would settle it

Re-running the two-stage algorithm on a held-out portion of Community Notes data and finding no improvement in out-of-sample predictive accuracy compared with consensus-based aggregation would falsify the performance claim.

Figures

Figures reproduced from arXiv: 2603.18053 by Christian Borgs, Jennifer Chayes, Karissa Huang, Yeganeh Alimohammadi.

Figure 1: Visualization of rater and note factor distribution shift over time.
Figure 2: Rolling Spearman correlation between rater-note factor dot product alignment and helpfulness ratings for the ...
Figure 3: Pre–post change in the share of notes with final status Helpful by controversy category around the Rating ...
Figure 4: This figure shows the weekly mean squared error (MSE) for in-sample vs. out-of-sample predictions from the ...
Figure 5: Weekly out-of-sample (one-week-ahead) predictions for residuals estimated using matrix factorization vs. ...
Figure 6: The top figure shows rater factor distribution shift for the early user cohort and the bottom figure shows rater ...
Figure 7: The top figure shows rater factor distribution shift for the early user cohort and the bottom figure shows rater ...
Figure 8: This figure shows a histogram of the difference in rater factor between Oct. 2022 and Jan. 2023 for raters in ...
Figure 9: This plot visualizes the difference in rater factor distribution for early users compared to new users before and ...
Figure 10: This figure shows the RDD analysis on the bimodality coefficient measured at weekly intervals for rater ...
Figure 11: This figure shows the RDD analysis on the bimodality coefficient measured at weekly intervals for note ...
Figure 12: Permutation test test statistic distribution for the difference in Spearman correlation (post minus pre ...
Figure 13: Permutation test test statistic distribution for the difference in logistic regression coefficients (post minus ...
Figure 14: Pre-post change in the share of notes with final status Helpful by controversy category using topic-based ...
Figure 15: Pre-post change in the share of notes with final status Helpful by controversy category using topic-based ...
Figure 16: Pre-post change in the share of notes with final status Helpful by controversy category using ...
Original abstract

Online social platforms increasingly rely on crowd-sourced systems to label misleading content at scale, but these systems must both aggregate users' evaluations and decide whose evaluations to trust. To address the latter, many platforms audit users by rewarding agreement with the final aggregate outcome, a design we term consensus-based auditing. We analyze the consequences of this design in X's Community Notes, which in September 2022 adopted consensus-based auditing that ties users' eligibility for participation to agreement with the eventual platform outcome. We find evidence of strategic conformity: minority contributors' evaluations drift toward the majority and their participation share falls on controversial topics, where independent signals matter most. We formalize this mechanism in a behavioral model in which contributors trade off private beliefs against anticipated penalties for disagreement. Motivated by these findings, we propose a two-stage auditing and aggregation algorithm that weights contributors by the stability of their past residuals rather than by agreement with the majority. The method first accounts for differences across content and contributors, and then measures how predictable each contributor's evaluations are relative to the latent-factor model. Contributors whose evaluations are consistently informative receive greater influence in aggregation, even when they disagree with the prevailing consensus. In the Community Notes data, this approach improves out-of-sample predictive performance while avoiding penalization of disagreement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper examines X's Community Notes system, documenting strategic conformity under consensus-based auditing: minority contributors' ratings drift toward the majority and their participation declines on controversial topics. It formalizes this in a behavioral model and proposes a two-stage algorithm that first fits a latent-factor model to account for content and contributor differences, then weights contributors by the stability of their residuals relative to that model. The central empirical claim is that this stability-weighted aggregator improves out-of-sample predictive performance on Community Notes data while avoiding penalization of disagreement.

Significance. If the two-stage method genuinely delivers out-of-sample gains without mechanically down-weighting informative minority signals, the result would be relevant for platform design of crowd-sourced moderation. The behavioral model and observational evidence on conformity provide a useful starting point, though the strength of the performance claim depends on resolving the data-splitting and model-fitting details.

major comments (3)
  1. [Abstract and method section] The description of the two-stage algorithm (abstract and §4) does not state whether the latent-factor model is estimated on the full panel (including the eventual consensus ratings used to define eligibility) or only on training data. If the former, residuals and stability scores will absorb majority-driven patterns, undermining both the 'avoids penalization of disagreement' claim and the reported out-of-sample improvement.
  2. [Empirical results section] The out-of-sample predictive performance comparison lacks explicit baseline specifications, train/test split details, and controls for post-hoc tuning of the stability threshold or weighting function. Without these, it is difficult to rule out that the reported gains are driven by in-sample fitting rather than genuine generalization.
  3. [§3] The behavioral model in §3 assumes contributors trade off private beliefs against anticipated penalties, but the empirical test of strategic conformity does not report robustness checks that separate selection into participation from changes in rating behavior conditional on participating.
minor comments (2)
  1. [Method] Notation for the latent-factor model and residual stability metric should be defined more explicitly with equations rather than prose descriptions.
  2. [Figures] Figure captions and axis labels for the conformity and participation results could be expanded to clarify the exact sample restrictions and time windows used.
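
One way the notation requested in minor comment 1 could look is sketched below. It is an assumption for illustration, not the paper's definition: the first stage borrows the intercept-plus-factor form documented for the open-source Community Notes scorer, and the symbols $\hat\sigma_u^2$, $w_u$, $\hat s_n$, and $\epsilon$ are hypothetical choices for the stability metric, the contributor weight, the aggregated note score, and a small smoothing constant.

```latex
% Stage 1: latent-factor model for the rating of note n by rater u
\hat r_{un} = \mu + i_u + i_n + f_u^{\top} f_n,
\qquad e_{un} = r_{un} - \hat r_{un}

% Stage 2: residual stability of rater u over the notes N_u they have rated
\bar e_u = \frac{1}{|N_u|} \sum_{n \in N_u} e_{un},
\qquad
\hat\sigma_u^2 = \frac{1}{|N_u|} \sum_{n \in N_u} \left( e_{un} - \bar e_u \right)^2,
\qquad
w_u = \frac{1}{\hat\sigma_u^2 + \epsilon}

% Aggregation over the raters U_n of note n, after removing rater-specific terms
\hat s_n = \frac{\sum_{u \in U_n} w_u \left( r_{un} - i_u - f_u^{\top} f_n \right)}{\sum_{u \in U_n} w_u}
```

Under this reading, $w_u$ depends only on how tightly rater $u$'s residuals cluster, not on the sign or size of their disagreement with the eventual note status.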

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment below with clarifications on our current implementation and commitments to revisions that improve transparency and robustness without altering the core claims.

point-by-point responses
  1. Referee: [Abstract and method section] The description of the two-stage algorithm (abstract and §4) does not state whether the latent-factor model is estimated on the full panel (including the eventual consensus ratings used to define eligibility) or only on training data. If the former, residuals and stability scores will absorb majority-driven patterns, undermining both the 'avoids penalization of disagreement' claim and the reported out-of-sample improvement.

    Authors: The latent-factor model is estimated only on training data within each cross-validation fold, ensuring residuals and stability scores are computed without access to test-set outcomes or the final consensus ratings. This design prevents absorption of majority-driven patterns. We will revise the abstract and §4 to state this explicitly, add pseudocode for the procedure, and include a note confirming no information leakage from eligibility definitions.
    revision: yes

  2. Referee: [Empirical results section] The out-of-sample predictive performance comparison lacks explicit baseline specifications, train/test split details, and controls for post-hoc tuning of the stability threshold or weighting function. Without these, it is difficult to rule out that the reported gains are driven by in-sample fitting rather than genuine generalization.

    Authors: We will expand the empirical results section with explicit details on temporal train/test splits (e.g., 70/30 by note creation date), a full set of baselines (unweighted mean, agreement-weighted, and simple majority), and sensitivity analyses for the stability threshold and weighting function. All hyperparameters are selected via training-data cross-validation only; we will report these controls to demonstrate generalization.
    revision: yes

  3. Referee: [§3] The behavioral model in §3 assumes contributors trade off private beliefs against anticipated penalties, but the empirical test of strategic conformity does not report robustness checks that separate selection into participation from changes in rating behavior conditional on participating.

    Authors: Our current tests already document both participation decline and rating drift. We will add a robustness check limited to contributors active on both controversial and non-controversial notes, showing rating conformity persists conditional on continued participation. This directly addresses selection effects while preserving the behavioral model's interpretation.
    revision: partial
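
The split discipline promised in responses 1 and 2 can be illustrated with a short sketch. It assumes ratings in long format (one row per rating) and uses a deliberately simple rater-mean predictor as a stand-in for the paper's model; the function names, the `shrink` parameter, and the synthetic data are all hypothetical. The point it shows is narrow: notes are split by creation time, and every fitted quantity is computed from the training rows only.

```python
import numpy as np

def temporal_split(note_ids, note_created, train_frac=0.7):
    """Boolean mask over ratings: True if the rated note is among the earliest notes."""
    notes, first = np.unique(note_ids, return_index=True)
    order = np.argsort(note_created[first], kind="stable")
    n_train = int(round(train_frac * len(notes)))
    return np.isin(note_ids, notes[order[:n_train]])

def fit_rater_means(rater_ids, ratings, train, n_raters, shrink=5.0):
    """Global mean and shrunken per-rater means, estimated on training rows only."""
    g = ratings[train].mean()
    cnt = np.bincount(rater_ids[train], minlength=n_raters)
    tot = np.bincount(rater_ids[train], weights=ratings[train], minlength=n_raters)
    return g, (tot + shrink * g) / (cnt + shrink)

def mse(pred, y):
    return float(np.mean((np.asarray(pred) - np.asarray(y)) ** 2))

# Synthetic long-format ratings (illustrative data, not Community Notes).
rng = np.random.default_rng(0)
n_raters, n_notes, n_obs = 50, 200, 4000
rater_ids = rng.integers(0, n_raters, n_obs)
note_ids = rng.integers(0, n_notes, n_obs)
note_created = rng.random(n_notes)[note_ids]
rater_effect = 0.3 * rng.standard_normal(n_raters)
ratings = rng.normal(rater_effect[rater_ids], 1.0)

train = temporal_split(note_ids, note_created)
test = ~train
g, rater_mean = fit_rater_means(rater_ids, ratings, train, n_raters)
print("global-mean baseline MSE:", mse(np.full(test.sum(), g), ratings[test]))
print("rater-mean predictor MSE:", mse(rater_mean[rater_ids[test]], ratings[test]))
```

Because the split is by note rather than by rating, no held-out note contributes to any training statistic, which is the property the referee's first comment asks the authors to document.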

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes a two-stage algorithm that first fits a latent-factor model to account for content and contributor differences, then weights by stability of past residuals relative to that model. It explicitly evaluates this on out-of-sample predictive performance in the Community Notes data and contrasts it with consensus-based auditing. No equations or descriptions in the abstract or described chain show the stability metric or performance gain reducing to a fit on the evaluation data itself, nor any self-definitional loop, self-citation load-bearing premise, or renaming of known results. The behavioral model of strategic conformity is presented as an independent empirical observation. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that past residual stability predicts future informativeness and that the latent-factor model adequately captures content and rater differences without introducing bias.

free parameters (2)
  • latent-factor dimensionality
    Number of factors in the first-stage model; chosen to account for differences across content and contributors.
  • stability threshold or weighting function
    Parameter controlling how much influence is given to contributors with stable residuals.
axioms (1)
  • domain assumption: Contributors trade off private beliefs against anticipated penalties for disagreement.
    Behavioral model invoked to explain observed drift toward majority.
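
A toy version of this trade-off makes the conformity mechanism concrete. The quadratic loss below is an assumption chosen for illustration, not the paper's behavioral model: a contributor with private belief `belief` picks the report `r` that minimizes `(r - belief)**2 + penalty * (r - expected_consensus)**2`. Setting the derivative to zero gives a closed-form report that is a weighted average of the belief and the expected consensus. All names are hypothetical.

```python
def reported_rating(belief, expected_consensus, penalty):
    """Minimizer of (r - belief)^2 + penalty * (r - expected_consensus)^2 over r."""
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    return (belief + penalty * expected_consensus) / (1.0 + penalty)

# A minority contributor (belief 0.0) facing an expected consensus of 1.0:
for penalty in (0.0, 1.0, 3.0):
    print(penalty, reported_rating(0.0, 1.0, penalty))
```

With no penalty the contributor reports their belief; as the penalty grows, the report moves toward the expected consensus, which is the drift pattern the axiom is invoked to explain.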

Reference graph

Works this paper leans on

77 extracted references · 77 canonical work pages

  1. [1]

    Dahleh, Ilan Lobel, and Asuman Ozdaglar

    Daron Acemoglu, Munther A. Dahleh, Ilan Lobel, and Asuman Ozdaglar. Bayesian learning in social networks. The Review of Economic Studies, 78(4):1201–1236, 2011

  2. [2]

    Fast and slow learning from reviews

    Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asu Ozdaglar. Fast and slow learning from reviews. Econometrica, 90(2):775–810, 2022

  3. [3]

    A model of online misinformation.Review of Economic Studies, 91(6):3117–3150, 2024

    Daron Acemoglu, Asuman Ozdaglar, and James Siderius. A model of online misinformation.Review of Economic Studies, 91(6):3117–3150, 2024

  4. [4]

    A. C. Aitken. Iv.—on least squares and linear combination of observations.Proceedings of the Royal Society of Edinburgh, 55:42–48, 1936

  5. [5]

    Birds of a feather don’t fact-check each other: Partisanship and the evaluation of news in twitter’s birdwatch crowdsourced fact-checking program

    Jennifer Allen, Cameron Martel, and David G Rand. Birds of a feather don’t fact-check each other: Partisanship and the evaluation of news in twitter’s birdwatch crowdsourced fact-checking program. InProceedings of the 2022 CHI Conference on Human Factors in Computing Systems, CHI ’22, New York, NY , USA, 2022. Association for Computing Machinery

  6. [6]

    Quantifying the impact of misinformation and vaccine- skeptical content on facebook.Science, 384(6699):eadk3451, 2024

    Jennifer Allen, Duncan J Watts, and David G Rand. Quantifying the impact of misinformation and vaccine- skeptical content on facebook.Science, 384(6699):eadk3451, 2024

  7. [7]

    I like it

    Xavier Amatriain, Josep Pujol, and Nuria Oliver. I like it... i like it not: Evaluating user ratings noise in recommender systems, 06 2009

  8. [8]

    Panel data models with interactive fixed effects.Econometrica, 77(4):1229–1279, 2009

    Jushan Bai. Panel data models with interactive fixed effects.Econometrica, 77(4):1229–1279, 2009

  9. [9]

    Banerjee

    Abhijit V . Banerjee. A Simple Model of Herd Behavior*.The Quarterly Journal of Economics, 107(3):797–817, August 1992. _eprint: https://academic.oup.com/qje/article-pdf/107/3/797/5298496/107-3-797.pdf

  10. [10]

    Zhang, Connie Moon Sehat, and Tanushree Mitra

    Md Momen Bhuiyan, Amy X. Zhang, Connie Moon Sehat, and Tanushree Mitra. Investigating differences in crowdsourced news credibility assessment: Raters, tasks, and expert criteria.Proc. ACM Hum.-Comput. Interact., 4(CSCW2), October 2020

  11. [11]

    Timing matters when correcting fake news.Proceedings of the National Academy of Sciences, 118(5):e2020043118, 2021

    Nadia M Brashier, Gordon Pennycook, Adam J Berinsky, and David G Rand. Timing matters when correcting fake news.Proceedings of the National Academy of Sciences, 118(5):e2020043118, 2021

  12. [12]

    Carroll and David Ruppert

    Raymond J. Carroll and David Ruppert. Robust estimation in heteroscedastic linear models.The Annals of Statistics, 10(2):429–441, 1982

  13. [13]

    Noisy matrix completion: Understanding statistical guarantees for convex relaxation via nonconvex optimization.SIAM journal on optimization, 30(4):3098– 3121, 2020

    Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma, and Yuling Yan. Noisy matrix completion: Understanding statistical guarantees for convex relaxation via nonconvex optimization.SIAM journal on optimization, 30(4):3098– 3121, 2020

  14. [14]

    Locking and unlocking the ability to write notes

    Community Notes Guide – X. Locking and unlocking the ability to write notes. https://communitynotes.x. com/guide/en/contributing/writing-ability, n.d. Accessed: 2025-08-30

  15. [15]

    Rating and writing impact

    Community Notes Guide – X. Rating and writing impact. https://communitynotes.x.com/guide/en/ contributing/writing-and-rating-impact, n.d. Accessed: 2025-08-30

  16. [16]

    Aggregation of consumer ratings: an application to yelp

    Weijia Dai, Ginger Jin, Jungmin Lee, and Michael Luca. Aggregation of consumer ratings: an application to yelp. com.Quantitative Marketing and Economics, 16(3):289–339, 2018

  17. [17]

    Maximum likelihood estimation of observer error-rates using the em algorithm.Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1):20–28, 1979

    Alexander Philip Dawid and Allan M Skene. Maximum likelihood estimation of observer error-rates using the em algorithm.Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1):20–28, 1979

  18. [18]

    Diffusion of community fact-checked misinformation on twitter

    Chiara Patricia Drolsbach and Nicolas Pröllochs. Diffusion of community fact-checked misinformation on twitter. Proceedings of the ACM on Human-Computer Interaction, 7(CSCW2):1–22, 2023

  19. [19]

    Community notes increase trust in fact-checking on social media.PNAS nexus, 3(7):pgae217, 2024

    Chiara Patricia Drolsbach, Kirill Solovev, and Nicolas Pröllochs. Community notes increase trust in fact-checking on social media.PNAS nexus, 3(7):pgae217, 2024

  20. [20]

    Naive herding in rich-information settings.American economic journal: microeconomics, 2(4):221–243, 2010

    Erik Eyster and Matthew Rabin. Naive herding in rich-information settings.American economic journal: microeconomics, 2(4):221–243, 2010

  21. [21]

    Springer Nature, 2022

    Boi Faltings and Goran Radanovic.Game theory for data science: Eliciting truthful information. Springer Nature, 2022

  22. [22]

    Uncertainty quantification for low-rank matrix completion with heterogeneous and sub-exponential noise

    Vivek Farias, Andrew A Li, and Tianyi Peng. Uncertainty quantification for low-rank matrix completion with heterogeneous and sub-exponential noise. InInternational Conference on Artificial Intelligence and Statistics, pages 1179–1189. PMLR, 2022

  23. [23]

    V ox populi, 1907

    Francis Galton. V ox populi, 1907. 16 APREPRINT- MARCH20, 2026

  24. [24]

    Can crowdchecking curb misinformation? evidence from community notes.Information Systems Research, 2025

    Yang Gao, Maggie Mengqing Zhang, and Huaxia Rui. Can crowdchecking curb misinformation? evidence from community notes.Information Systems Research, 2025

  25. [25]

    Strictly proper scoring rules, prediction, and estimation.Journal of the American Statistical Association, 102(477):359–378, 2007

    Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and estimation.Journal of the American Statistical Association, 102(477):359–378, 2007

  26. [26]

    Incentives and truthful reporting in consensus-centric crowdsourcing

    Eric Horvitz. Incentives and truthful reporting in consensus-centric crowdsourcing. Technical report, Microsoft Research, 2012

  27. [27]

    Finding the wise and the wisdom in a crowd: Estimating underlying qualities of reviewers and items.American Economic Review, 111(3):1001–1024, 2021

    Matthew Jackson and Stephen Nei. Finding the wise and the wisdom in a crowd: Estimating underlying qualities of reviewers and items.American Economic Review, 111(3):1001–1024, 2021

  28. [28]

    Who checks the checkers? exploring source credibility in twitter’s community notes.arXiv preprint arXiv:2406.12444, 2024

    Uku Kangur, Roshni Chakraborty, and Rajesh Sharma. Who checks the checkers? exploring source credibility in twitter’s community notes.arXiv preprint arXiv:2406.12444, 2024

  29. [29]

    Iterative learning for reliable crowdsourcing systems.Advances in neural information processing systems, 24, 2011

    David Karger, Sewoong Oh, and Devavrat Shah. Iterative learning for reliable crowdsourcing systems.Advances in neural information processing systems, 24, 2011

  30. [30]

    Trustworthy human computation: a survey

    Hisashi Kashima, Satoshi Oyama, Hiromi Arai, and Junichiro Mori. Trustworthy human computation: a survey. Artificial Intelligence Review, 57(12):322, 2024

  31. [31]

    Putting peer prediction under the micro (economic) scope and making truth-telling focal

    Yuqing Kong, Katrina Ligett, and Grant Schoenebeck. Putting peer prediction under the micro (economic) scope and making truth-telling focal. InInternational Conference on Web and Internet Economics, pages 251–264. Springer, 2016

  32. [32]

    Machine-learning aided peer prediction

    Yang Liu and Yiling Chen. Machine-learning aided peer prediction. InProceedings of the 2017 ACM Conference on Economics and Computation, EC ’17, page 63–80, New York, NY , USA, 2017. Association for Computing Machinery

  33. [33]

    How social influence can undermine the wisdom of crowd effect.Proceedings of the national academy of sciences, 108(22):9020–9025, 2011

    Jan Lorenz, Heiko Rauhut, Frank Schweitzer, and Dirk Helbing. How social influence can undermine the wisdom of crowd effect.Proceedings of the national academy of sciences, 108(22):9020–9025, 2011

  34. [34]

    Introducing community notes — adding context to posts

    Meta Platforms, Inc. Introducing community notes — adding context to posts. https://www.meta.com/technologies/community-notes/?srsltid= AfmBOoqGYuB01StOhwvVzji0toKNwMWsuS3OurkU7X3c5L2AvsifdBYC, 2025. Accessed: 2025-11-17

  35. [35]

    Eliciting informative feedback: The peer-prediction method

    Nolan Miller, Paul Resnick, and Richard Zeckhauser. Eliciting informative feedback: The peer-prediction method. Management Science, 51(9):1359–1373, 2005

  36. [36]

    Linear regression for panel with unknown number of factors as interactive fixed effects.Econometrica, 83(4):1543–1579, 2015

    Hyungsik Roger Moon and Martin Weidner. Linear regression for panel with unknown number of factors as interactive fixed effects.Econometrica, 83(4):1543–1579, 2015

  37. [37]

    Social influence bias: A randomized experiment.Science, 341(6146):647–651, 2013

    Lev Muchnik, Sinan Aral, and Sean J Taylor. Social influence bias: A randomized experiment.Science, 341(6146):647–651, 2013

  38. [38]

    University of Chicago Press, 1974

    Elisabeth Noelle-Neumann.The Spiral of Silence: A Theory of Public Opinion. University of Chicago Press, 1974

  39. [39]

    Performative prediction

    Juan Perdomo, Tijana Zrnic, Celestine Mendler-Dünner, and Moritz Hardt. Performative prediction. InInterna- tional Conference on Machine Learning, pages 7599–7609. PMLR, 2020

  40. [40]

    Twitter expands its crowdsourced fact-checking program Birdwatch ahead of US midterms, September

    Sarah Perez. Twitter expands its crowdsourced fact-checking program Birdwatch ahead of US midterms, September

  41. [41]

    Accessed: 2025-09-16

  42. [42]

    Twitter is making its crowdsourced fact-checks visible to all U.S

    Sarah Perez. Twitter is making its crowdsourced fact-checks visible to all U.S. users with Birdwatch expansion, October 2022. Accessed: 2025-09-16

  43. [43]

    Bluesky adds ‘anti-toxicity’ tools and aims to integrate ‘a community notes-like’ feature in the future

    Sarah Perez. Bluesky adds ‘anti-toxicity’ tools and aims to integrate ‘a community notes-like’ feature in the future. TechCrunch, 2024. Accessed: 2025-11-17

  44. [44]

    A bayesian truth serum for subjective data.Science, 306(5695):462–466, 2004

    Drazen Prelec. A bayesian truth serum for subjective data.Science, 306(5695):462–466, 2004

  45. [45]

    Learning from crowds.Journal of machine learning research, 11(4), 2010

    Vikas C Raykar, Shipeng Yu, Linda H Zhao, Gerardo Hermosillo Valadez, Charles Florin, Luca Bogoni, and Linda Moy. Learning from crowds.Journal of machine learning research, 11(4), 2010

  46. [46]

    Republicans are flagged more often than democrats for sharing misinformation on x’s community notes.Proceedings of the National Academy of Sciences, 122(25):e2502053122, 2025

    Thomas Renault, Mohsen Mosleh, and David G Rand. Republicans are flagged more often than democrats for sharing misinformation on x’s community notes.Proceedings of the National Academy of Sciences, 122(25):e2502053122, 2025

  47. [47]

    The influence limiter: provably manipulation-resistant recommender systems

    Paul Resnick and Rahul Sami. The influence limiter: provably manipulation-resistant recommender systems. In Proceedings of the 2007 ACM Conference on Recommender Systems, RecSys ’07, page 25–32, New York, NY , USA, 2007. Association for Computing Machinery. 17 APREPRINT- MARCH20, 2026

  48. [48]

    Informed truthfulness in multi-task peer prediction

    Victor Shnayder, Arpit Agarwal, Rafael Frongillo, and David C Parkes. Informed truthfulness in multi-task peer prediction. InProceedings of the 2016 ACM Conference on Economics and Computation, pages 179–196, 2016

  49. [49]

    Community notes reduce engagement with and diffusion of false information online.Proceedings of the National Academy of Sciences, 122(38):e2503413122, 2025

    Isaac Slaughter, Axel Peytavin, Johan Ugander, and Martin Saveski. Community notes reduce engagement with and diffusion of false information online.Proceedings of the National Academy of Sciences, 122(38):e2503413122, 2025

  50. [50]

    Pathological outcomes of observational learning.Econometrica, 68(2):371–398, 2000

    Lones Smith and Peter Sørensen. Pathological outcomes of observational learning.Econometrica, 68(2):371–398, 2000

  51. [51]

    Weighted low-rank approximations

    Nathan Srebro and Tommi Jaakkola. Weighted low-rank approximations. InProceedings of the 20th international conference on machine learning (ICML-03), pages 720–727, 2003

  52. [52]

    Estimation and inference for unbalanced panel data models with interactive fixed effects.Journal of Econometrics, 255:106222, 2026

    Liangjun Su, Fa Wang, and Yiren Wang. Estimation and inference for unbalanced panel data models with interactive fixed effects.Journal of Econometrics, 255:106222, 2026

  53. [53]

    Vintage, 2005

    James Surowiecki.The wisdom of crowds. Vintage, 2005

  54. [54]

    Diverse perspectives can mitigate political bias in crowdsourced content moderation

    Jacob Thebault-Spieker, Sukrit Venkatagiri, Naomi Mine, and Kurt Luther. Diverse perspectives can mitigate political bias in crowdsourced content moderation. InProceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, pages 1280–1291, 2023

  55. [55]

    TikTok Pte. Ltd. Rolling out tiktok footnotes in the u.s. https://newsroom.tiktok.com/ rolling-out-tiktok-footnotes-in-the-us?lang=en, 2025. Accessed: 2025-11-17

  56. [56]

    Community notes: Documentation and source code powering community notes

    Twitter, Inc. Community notes: Documentation and source code powering community notes. https://github.com/twitter/communitynotes, 2022

  57. [57]

    Generalized low rank models. Foundations and Trends in Machine Learning, 9(1):1–118, 2016

    Madeleine Udell, Corinne Horn, Reza Zadeh, and Stephen Boyd. Generalized low rank models. Foundations and Trends in Machine Learning, 9(1):1–118, 2016

  58. [58]

    Misinformation: susceptibility, spread, and interventions to immunize the public. Nature Medicine, 28(3):460–467, 2022

    Sander Van Der Linden. Misinformation: susceptibility, spread, and interventions to immunize the public. Nature Medicine, 28(3):460–467, 2022

  59. [59]

    Manipulation robustness of collaborative filtering. Management Science, 56(11):1911–1929, 2010

    Benjamin Van Roy and Xiang Yan. Manipulation robustness of collaborative filtering. Management Science, 56(11):1911–1929, 2010

  60. [60]

    Introduction to the non-asymptotic analysis of random matrices, 2012

    Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices, 2012

  61. [61]

    The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3):554–559, 2016

    Michela Del Vicario, Alessandro Bessi, Fabiana Zollo, Fabio Petroni, Antonio Scala, Guido Caldarelli, H. Eugene Stanley, and Walter Quattrociocchi. The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3):554–559, 2016

  62. [62]

    The spread of true and false news online. Science, 359(6380):1146–1151, 2018

    Soroush Vosoughi, Deb Roy, and Sinan Aral. The spread of true and false news online. Science, 359(6380):1146–1151, 2018

  63. [63]

    Output agreement mechanisms and common knowledge

    Bo Waggoner and Yiling Chen. Output agreement mechanisms and common knowledge. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, volume 2, pages 220–226, 2014

  64. [64]

    Misinformation in and about science. Proceedings of the National Academy of Sciences, 118(15):e1912444117, 2021

    Jevin D. West and Carl T. Bergstrom. Misinformation in and about science. Proceedings of the National Academy of Sciences, 118(15):e1912444117, 2021

  65. [65]

    Peer prediction without a common prior

    Jens Witkowski and David C Parkes. Peer prediction without a common prior. In Proceedings of the 13th ACM Conference on Electronic Commerce, pages 964–981, 2012

  66. [66]

    Ranking notes

    X Community Notes Guide. Ranking notes. https://communitynotes.x.com/guide/en/under-the-hood/ranking-notes, n.d. Accessed: 2025-08-30

  67. [67]

    About community notes on x

    X Corp. About community notes on x. https://help.x.com/en/using-x/community-notes, 2025. Accessed: 2025-11-17

  68. [68]

    Note ranking algorithm

    X Corp. / Community Notes Guide. Note ranking algorithm. https://communitynotes.x.com/guide/en/under-the-hood/ranking-notes, n.d. Accessed: 2025-08-05

  69. [69]

    Downloading data

    X (formerly Twitter) Community Notes Guide. Downloading data. https://communitynotes.x.com/guide/en/under-the-hood/download-data, n.d. Accessed: 2025-08-30

  70. [70]

    Mapping the spiral of silence: Surveying unspoken opinions in online communities

    Dora Zhao, Diyi Yang, and Michael S. Bernstein. Mapping the spiral of silence: Surveying unspoken opinions in online communities. arXiv preprint arXiv:2502.00952, 2025.

    A Guide to the Appendix

    This Appendix has two goals. First, it provides the empirical implementation details and robustness analyses underlying the main-text res...

  71. [71]

    The user-side variables $\{(h_u, f_u)\}_{u=1}^{U}$ are independent of the note-side variables $\{(i_n, g_n)\}_{n=1}^{N}$.

  72. [72]

    The noise variables $\epsilon_{un}$ are independent, mean-zero, $\sigma_\epsilon$-sub-Gaussian, and independent of all other latent variables (see, e.g., [59], Definition 5.7).

  73. [73]

    Each entry $(u, n)$ is observed independently with probability $p \in (0,1)$, and $U \asymp N$.

    Assumption 2 (Conformity parameters). In addition to Assumption 1, assume:

    1. $m_n$ are i.i.d. draws from a bounded distribution on a compact interval with positive finite variance.
    2. $\rho_n := \rho(c_n)$ is a weakly decreasing function in $c_n$.

  74. [74]

    Let $\bar{\rho}_N = N^{-1} \sum_n \rho_n$. Assume that $\bar{\rho}_N \to \bar{\rho} \in [0,1]$, a deterministic constant.

  75. [75]

    For each note $n$, the random variables $m_n$ and $g_n$ are independent.

    E.2 Private-Signal Reporting: Consistency of Note Helpfulness (Proof of Theorem 1)

    We first study the benchmark case in which contributors report their private signals, so the platform observes a noisy version of the latent signal matrix $S = \mu \mathbf{1}_U \mathbf{1}_N^{\top} + h \mathbf{1}_N^{\top} + \mathbf{1}_U i^{\top}$...

  76. [76]

    Combining this with the lower bound on $\sum_{u,n} x_{un}^2$ established above, we obtain with high probability

    $$\sum_{u,n} \omega_{un} x_{un}^2 \ge c' \big( UN(\delta\mu)^2 + N\|\delta h\|_2^2 + U\|\delta i\|_2^2 + N\|\delta f\|_2^2 + U\|\delta g\|_2^2 \big) - C\|\delta f\|_2^2 \|\delta g\|_2^2 - C\|\delta f\|_2^4$$

    for some constants $c', C > 0$.

    Now, we turn to proving upper bounds. Recall that it remains to upper bound the terms $\sum_{u,n} \omega_{un} y_{un}^2$, $\sum_{u,n} \omega_{un} \epsilon_{un} x_{un}$, $\sum_{u,n}$ ...

  77. [77]

    Then,

    $$\sum_{u,n} \omega_{un} y_{un}^2 \le \sum_{u,n} y_{un}^2 = \sum_{u,n} (\delta f_u \, \delta g_n)^2 \le \|\delta f\|_2^2 \|\delta g\|_2^2 \le \frac{M(\delta)^2}{UN}.$$

    Since $X$ is at most rank 5, we have

    $$\sum_{u,n} \omega_{un} \epsilon_{un} x_{un} \le \|\Omega \circ E\|_{\mathrm{op}} \cdot \|X\|_{*} \le \sqrt{5}\, \|\Omega \circ E\|_{\mathrm{op}} \cdot \|X\|_F.$$

    Here, we slightly abuse notation and let $\Omega$ denote the matrix with entries $\omega_{un}$. Using the fact that $\|X\|_F^2 \lesssim M(\delta)$, Young's inequality, and sub-Gaussian errors from Assumption 1, we ...
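    The rank step above combines two standard facts: the trace inner product is bounded by the operator norm times the nuclear norm, and the nuclear norm of a rank-$r$ matrix is at most $\sqrt{r}$ times its Frobenius norm. A minimal numerical sanity check is sketched below. It is an illustration only, not part of the proof: the function name `check_rank_bound`, the matrix sizes, the Gaussian noise (used as a stand-in for sub-Gaussian errors), and the Bernoulli mask probability are all assumptions made for this sketch.

```python
import numpy as np


def check_rank_bound(U=60, N=50, r=5, p=0.3, seed=0):
    """Return (inner, op*nuc, sqrt(r)*op*fro) for a random rank-<=r matrix X.

    All sizes and distributions here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # X: product of a U x r and an r x N factor, so rank(X) <= r.
    X = rng.standard_normal((U, r)) @ rng.standard_normal((r, N))
    # Omega: Bernoulli(p) observation mask; E: Gaussian noise matrix.
    Omega = (rng.random((U, N)) < p).astype(float)
    E = rng.standard_normal((U, N))
    A = Omega * E  # Hadamard product, Omega ∘ E

    inner = float(np.sum(A * X))         # sum_{u,n} omega_un * eps_un * x_un
    op = np.linalg.norm(A, 2)            # operator (spectral) norm of Omega ∘ E
    nuc = np.linalg.norm(X, "nuc")       # nuclear norm of X
    fro = np.linalg.norm(X, "fro")       # Frobenius norm of X
    return inner, op * nuc, np.sqrt(r) * op * fro


if __name__ == "__main__":
    inner, mid, upper = check_rank_bound()
    print(inner <= mid <= upper)
```

    With `r=5` the last returned value corresponds to the $\sqrt{5}\,\|\Omega \circ E\|_{\mathrm{op}} \|X\|_F$ term; the chain of inequalities holds for any draw, so varying `seed` only changes how loose the bound is.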