Who, Why, and How: Disentangling the Effects of Moderation Source, Context, and Language on Post-Removal Behavior

Emilio Ferrara; Lindsay Young; Marlon Twyman; Siyi Zhou

arxiv: 2605.16204 · v1 · pith:OSYSUEV4new · submitted 2026-05-15 · 💻 cs.CY

Pith reviewed 2026-05-19 21:45 UTC · model grok-4.3

classification 💻 cs.CY

keywords content moderation · reddit · bot moderation · self-censorship · user compliance · violation severity · linguistic strategies

The pith

Bot moderation on Reddit produces higher compliance and lower self-censorship than human or modteam moderation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes how moderator source, violation context, and removal message language jointly shape what users do after their content is taken down. It draws on more than eleven million Reddit moderation events to compare bots, individual humans, and moderation teams. The central finding is that bots achieve stronger compliance with less silent withdrawal, while team moderation increases self-censorship, and violation severity reverses which linguistic tactics succeed.

Core claim

In a dataset of 11,795,036 moderation events across 9,285,410 users, bot-moderated removals yield higher compliance and lower self-censorship than removals by humans or modteams. Modteam actions produce the largest withdrawal effects. Linguistic features such as elaborated explanations and direct address improve outcomes only for routine violations; for serious violations these same features increase withdrawal, while prosocial and emotionally emphatic framing becomes most effective.

What carries the argument

Violation severity as a moderator of cue-based processing, tested inside an extension of the Human-AI Interaction Theory of Interactive Media Effects through probabilistic behavioral classification and regression on linguistic features extracted via PCA.
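
One way to read "violation severity as a moderator" is as an interaction term in a linear model. The notation below is an assumption made for illustration, not the paper's stated specification: y_i is a behavioral outcome for moderation event i, x_i is a PCA-derived linguistic feature score, and s_i is a severity indicator (0 for routine, 1 for serious).

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 s_i + \beta_3 \,(x_i \cdot s_i) + \varepsilon_i,
\qquad
\frac{\partial y_i}{\partial x_i} = \beta_1 + \beta_3 s_i
```

Under this sketch the linguistic feature has effect \beta_1 for routine violations and \beta_1 + \beta_3 for serious ones. A reversal of the kind described above corresponds to those two quantities having opposite signs.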

If this is right

Routine violations can be routed to bots to raise compliance rates without raising self-censorship.
Modteam interventions should be reserved for cases where institutional signaling is the goal rather than retention.
Removal messages for high-severity violations should favor prosocial framing and emotional emphasis over detailed explanations.
Moderation systems can become context-adaptive by letting violation severity select the linguistic strategy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The compliance advantage of bots may extend to other platforms if their community structures resemble Reddit's subreddit model.
Hybrid designs that start with bot messages and escalate serious cases to humans could capture both efficiency and perceived legitimacy.
Long-term user retention on platforms might rise if self-censorship is lowered through calibrated moderation language.

Load-bearing premise

The large observational dataset lets researchers attribute differences in user compliance and withdrawal directly to moderator source and message language without major confounding from subreddit norms or moderator assignment choices.

What would settle it

A randomized experiment that assigns identical violations to bot, human, or team moderation while varying message language and then measures the fraction of users who post again versus those who reduce activity.

Figures

Figures reproduced from arXiv: 2605.16204 by Emilio Ferrara, Lindsay Young, Marlon Twyman, Siyi Zhou.

**Figure 2.** Mean probability of user behavior trajectory after moderated by different source for different ...

**Figure 7.** Distribution for difference of post frequency, log ratio of post frequency, and moderation ...

Original abstract

Content moderation is a central mechanism through which platforms attempt to balance user engagement with community governance. Yet existing research has largely treated moderation as a uniform intervention, overlooking how moderator source, violation context, and linguistic style jointly shape user behavior. Drawing on the Human–AI Interaction Theory of Interactive Media Effects (HAII-TIME), this study examines how these three dimensions produce divergent post-moderation behavioral trajectories in a large-scale observational dataset of 11,795,036 moderation events across 9,285,410 users and 61,261 subreddits on Reddit (2021–2025). Using probabilistic behavioral classification, ANOVA, and OLS regression with PCA-derived linguistic features, we find that bot moderation consistently produces higher compliance and lower self-censorship than human or modteam moderation, challenging the assumption that human agency cues are inherently advantageous. Modteam moderation produces the strongest self-censorship effects, suggesting that institutional depersonalization is a meaningful driver of behavioral withdrawal. Violation severity emerges as a critical contingency: linguistic strategies effective in routine contexts (elaborated explanation, community-scale appeals, direct personal address) can backfire for serious violations, whereas prosocially framed and emotionally emphatic messages become most effective when stakes are highest. Of 480 linguistic interactions tested, 33 survive FDR correction. These findings extend HAII-TIME by introducing violation salience as a moderator of cue-based processing, and offer empirical grounding for context-adaptive moderation design.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. This paper analyzes a large observational dataset of 11,795,036 moderation events across 9,285,410 users and 61,261 subreddits on Reddit (2021-2025) to examine how moderator source (bot, human, modteam), violation context, and linguistic style jointly influence post-moderation user behavior. Drawing on HAII-TIME, it employs probabilistic behavioral classification, ANOVA, and OLS regression with PCA-derived linguistic features, reporting that bot moderation is associated with higher compliance and lower self-censorship than human or modteam moderation, that modteam moderation drives the strongest self-censorship, and that violation severity moderates the effectiveness of linguistic strategies (with 33 of 480 interactions surviving FDR correction). The work claims to extend HAII-TIME by introducing violation salience as a moderator of cue-based processing.

Significance. If the central associations hold after addressing potential confounding, the findings would be significant for computational social science and platform governance research by providing large-scale evidence on differential effects of automated versus human moderation and by identifying violation severity as a key contingency for linguistic interventions. The dataset scale, use of FDR correction across 480 tests, and extension of an existing theoretical framework are clear strengths that would support practical implications for context-adaptive moderation design.

major comments (3)

[Abstract] The claim that 'bot moderation consistently produces higher compliance and lower self-censorship' attributes outcomes causally to moderator source, yet the observational design compares outcomes across non-randomly assigned sources without demonstrated controls (e.g., subreddit fixed effects, violation-type stratification, or propensity weighting) for selection into moderator type or subreddit norms; the reported OLS and ANOVA results on PCA features therefore cannot isolate the source cue itself from the contexts in which each source appears.
[Methods/Results] (OLS and ANOVA sections) The manuscript does not detail whether the regression models include subreddit fixed effects, user-level clustering, or robustness checks such as propensity score weighting to address the non-random assignment of moderation sources; without these, the source main effects and the 33 FDR-significant interactions remain vulnerable to confounding and cannot cleanly support the headline behavioral attribution.
[Abstract and Discussion] The extension of HAII-TIME by 'introducing violation salience as a moderator' is presented as a theoretical contribution, but the observational data leave open whether the reported severity-by-language interactions reflect cue processing or unmeasured differences in how severe violations are routed to different moderator sources and linguistic framings.

minor comments (2)

[Abstract] The abstract would benefit from a brief parenthetical definition or citation for 'probabilistic behavioral classification' to clarify how compliance and self-censorship are operationalized from the 11.8M events.
[Results] Figure or table captions for the linguistic interaction results should explicitly state the exact number of tests (480) and the FDR threshold applied so readers can assess the 33 significant findings without returning to the text.
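
To make the "33 of 480 survive FDR correction" figure concrete, here is a minimal sketch of one common FDR procedure, Benjamini-Hochberg. The paper's exact procedure and threshold are not stated on this page, so both the choice of method and `alpha=0.05` are assumptions; the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if it survives BH at level alpha."""
    m = len(p_values)
    # Rank the tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        survives[idx] = rank <= cutoff_rank
    return survives


# Hypothetical p-values, for illustration only (not values from the paper).
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p_vals, alpha=0.05)
n_uncorrected = sum(p <= 0.05 for p in p_vals)
n_survive = sum(flags)
print(n_uncorrected, n_survive)
```

In this toy input, five p-values fall below 0.05 before correction but only two survive it, which is why a caption should report both the number of tests and the threshold.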

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below, clarifying our approach and indicating revisions where the manuscript can be strengthened without overstating the observational evidence.

Referee: [Abstract] The claim that 'bot moderation consistently produces higher compliance and lower self-censorship' attributes outcomes causally to moderator source, yet the observational design compares outcomes across non-randomly assigned sources without demonstrated controls (e.g., subreddit fixed effects, violation-type stratification, or propensity weighting) for selection into moderator type or subreddit norms; the reported OLS and ANOVA results on PCA features therefore cannot isolate the source cue itself from the contexts in which each source appears.

Authors: We agree that the phrasing 'produces' risks implying causation beyond what the observational data support. The reported OLS models control for violation severity, subreddit size, and other observed covariates, with violation-type stratification implicit in the interaction terms, but subreddit fixed effects and propensity weighting were not applied in the primary specifications. We will revise the abstract to use associative language ('is associated with') and add a dedicated robustness subsection describing these controls and limitations.
revision: yes

Referee: [Methods/Results] (OLS and ANOVA sections) The manuscript does not detail whether the regression models include subreddit fixed effects, user-level clustering, or robustness checks such as propensity score weighting to address the non-random assignment of moderation sources; without these, the source main effects and the 33 FDR-significant interactions remain vulnerable to confounding and cannot cleanly support the headline behavioral attribution.

Authors: The primary models include user-level random effects to address clustering and control for violation type and subreddit characteristics. Subreddit fixed effects were omitted from the main results to retain statistical power across 61,261 subreddits. We will expand the Methods section with complete model equations, explicit mention of the clustering approach, and new robustness analyses that incorporate subreddit fixed effects and propensity-score weighting on observable features such as subreddit activity and violation category.
revision: yes

Referee: [Abstract and Discussion] The extension of HAII-TIME by 'introducing violation salience as a moderator' is presented as a theoretical contribution, but the observational data leave open whether the reported severity-by-language interactions reflect cue processing or unmeasured differences in how severe violations are routed to different moderator sources and linguistic framings.

Authors: The models explicitly interact linguistic features with violation severity while holding moderator source constant within strata, which provides evidence consistent with salience moderating cue effectiveness. We cannot fully exclude differential routing with observational data alone. We will revise the Discussion to acknowledge this limitation more explicitly, frame the HAII-TIME extension as an empirical pattern supporting the proposed moderator rather than a conclusive test, and suggest future experimental designs to isolate routing mechanisms.
revision: partial
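
The confounding concern running through this exchange can be illustrated with a toy example of what subreddit fixed effects do. The sketch below is not the authors' model: the data are invented, and the two subreddits are constructed so that bots have no effect within either one, yet the pooled comparison still shows a bot advantage because the high-compliance subreddit happens to use bots more often. Fixed effects are approximated here by the standard within-group demeaning transform.

```python
from collections import defaultdict


def slope(xs, ys):
    """OLS slope of y on a single regressor x (model includes an intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def demean_by_group(groups, values):
    """Subtract each group's mean: the 'within' transform behind fixed effects."""
    totals, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]


# Hypothetical rows: (subreddit, is_bot, complied).
# Subreddit A: 80% compliance for bots and humans alike, mostly bot-moderated.
# Subreddit B: 20% compliance for bots and humans alike, mostly human-moderated.
rows = (
    [("A", 1, 1)] * 8 + [("A", 1, 0)] * 2 + [("A", 0, 1)] * 4 + [("A", 0, 0)] * 1
    + [("B", 1, 1)] * 1 + [("B", 1, 0)] * 4 + [("B", 0, 1)] * 2 + [("B", 0, 0)] * 8
)
subs = [r[0] for r in rows]
is_bot = [r[1] for r in rows]
complied = [r[2] for r in rows]

pooled = slope(is_bot, complied)  # ignores which subreddit a row came from
within = slope(demean_by_group(subs, is_bot), demean_by_group(subs, complied))
print(f"pooled={pooled:.3f} within_is_zero={abs(within) < 1e-9}")
```

The pooled slope is 0.2 while the within-subreddit slope is zero up to floating-point error, which is the pattern the referee worries the headline comparison could contain.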

Circularity Check

0 steps flagged

No significant circularity; empirical analysis is self-contained

The paper reports results from an observational dataset of 11.8M moderation events analyzed via probabilistic classification, ANOVA, and OLS regression on PCA-derived features. All load-bearing claims (bot moderation producing higher compliance, violation severity as moderator, 33 FDR-significant interactions) are statistical outputs from the data rather than quantities defined by the paper's own fitted parameters or reduced to self-citations by construction. The reference to HAII-TIME is used to frame the study and is extended by new empirical findings; it does not serve as a load-bearing premise whose validity depends on the present results. No self-definitional loops, fitted inputs called predictions, or ansatzes smuggled via citation appear in the derivation chain. The analysis is therefore independent of its own outputs and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The analysis rests on standard statistical modeling assumptions and data classification procedures rather than new theoretical entities or derivations.

free parameters (2)

PCA-derived linguistic feature dimensions
Number and selection of principal components for language features fitted from the moderation message corpus.
OLS regression coefficients for interaction terms
Coefficients estimated from data to quantify effects of moderator type, severity, and language on behavioral outcomes.

axioms (2)

domain assumption: Probabilistic behavioral classification correctly identifies compliance versus self-censorship from post-moderation activity logs
Central measurement step for the dependent variables.
domain assumption: OLS regression assumptions (linearity, no omitted variable bias, homoscedasticity) hold for the behavioral outcome models
Required for interpreting coefficient estimates as effects.
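
The first axiom above depends on turning activity logs into behavioral measures. The Figure 7 caption mentions the difference and the log ratio of post frequency; the sketch below computes both for a single user. It is not the paper's probabilistic classifier: the function name, the add-one smoothing, and the example counts are all assumptions made for illustration.

```python
import math


def activity_change(posts_before, posts_after, smoothing=1.0):
    """Return (difference, log ratio) of post counts around a removal.

    The smoothing constant is an assumed choice that keeps the log ratio
    defined when a user posts nothing in one of the two windows.
    """
    diff = posts_after - posts_before
    log_ratio = math.log((posts_after + smoothing) / (posts_before + smoothing))
    return diff, log_ratio


# Hypothetical users: (posts in the window before, posts in the window after).
examples = {"steady": (10, 10), "reduced": (10, 2), "silent": (10, 0)}
for name, (before, after) in examples.items():
    diff, log_ratio = activity_change(before, after)
    print(f"{name}: diff={diff}, log_ratio={log_ratio:.3f}")
```

A log ratio of zero means unchanged activity and increasingly negative values mean stronger withdrawal; how such a score maps to "self-censorship" versus "compliance" is exactly the measurement step the axiom takes on trust.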

pith-pipeline@v0.9.0 · 5806 in / 1379 out tokens · 49982 ms · 2026-05-19T21:45:39.137253+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Using probabilistic behavioral classification, one-way ANOVA, and OLS regression with principal component analysis (PCA)-derived linguistic features...
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

Bot moderation consistently produces higher compliance and lower self-censorship than human or modteam moderation

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages

[1]

Baym, N. K., & boyd, d. (2012). Socially Mediated Publicness: An Introduction. Journal of Broadcasting & Electronic Media, 56(3), 320–329. https://doi.org/10.1080/08838151.2012.705200

work page doi:10.1080/08838151.2012.705200 2012
[2]

Binns, R., Van Kleek, M., Veale, M., Lyngs, U., Zhao, J., & Shadbolt, N. (2018). ‘It’s Reducing a Human Being to a Percentage’: Perceptions of Justice in Algorithmic Decisions. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3173574.3173951

work page doi:10.1145/3173574.3173951 2018
[3]

Braithwaite, J. (2001). Restorative Justice & Responsive Regulation (1st ed.). Oxford University Press. https://doi.org/10.1093/oso/9780195136395.001.0001

work page doi:10.1093/oso/9780195136395.001.0001 2001
[4]

Brehm, J. W. (1966). A Theory of Psychological Reactance. Academic Press

work page 1966
[5]

Brown, P., & Levinson, S. C. (1987). Politeness: Some Universals in Language Usage. Cambridge University Press

work page 1987
[6]

Caplan, R. (2018). Content or Context Moderation? Artisanal, Community, and Industrial Approaches (tech. rep.). Data & Society Research Institute. New York. https://datasociety.net/library/content-or-context-moderation/

work page 2018
[7]

Chandrasekharan, E., Pavalanathan, U., Srinivasan, A., Glynn, A., Eisenstein, J., & Gilbert, E. (2017). You Can’t Stay Here: The Efficacy of Reddit’s 2015 Ban Examined Through Hate Speech. Proceedings of the ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW), 1–18. https://doi.org/10.1145/2998181.2998215

work page doi:10.1145/2998181.2998215 2017
[8]

Chandrasekharan, E., Samory, M., Jhaver, S., Charvat, H., Bruckman, A., Lampe, C., Eisenstein, J., & Gilbert, E. (2018). The Internet’s Hidden Rules: An Empirical Study of Reddit Norm Violations at Micro, Meso, and Macro Scales. Proc. ACM Hum.-Comput. Interact., 2(CSCW). https://doi.org/10.1145/3274301

work page doi:10.1145/3274301 2018
[9]

Chandrasekharan, E., Samory, M., Srinivasan, A., & Gilbert, E. (2022). Quarantined! Examining the Effects of Reddit Quarantines on Online Hate and Behavior. Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), 16(1), 109–120

work page 2022
[10]

Chang, J., Zhang, H., & Danescu-Niculescu-Mizil, C. (2022). Echoes of Moderation: How Banning Affects the Spread of Toxic Content Online. Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), 16(1), 76–87

work page 2022
[11]

Christin, A., Bernstein, M. S., Hancock, J. T., Jia, C., Mado, M. N., Tsai, J. L., & Xu, C. (2024). Internal Fractures: The Competing Logics of Social Media Platforms. Social Media + Society, 10(3), 20563051241274668. https://doi.org/10.1177/20563051241274668

work page doi:10.1177/20563051241274668 2024
[12]

Cialdini, R. B., & Goldstein, N. J. (2004). Social Influence: Compliance and Conformity. Annual Review of Psychology, 55(1), 591–621. https://doi.org/10.1146/annurev.psych.55.090902.142015

work page doi:10.1146/annurev.psych.55.090902.142015 2004
[13]

Deci, E. L., & Ryan, R. M. (2000). The “what” and “why” of goal pursuits: Human needs and the self-determination of behavior. Psychological Inquiry, 11(4), 227–268

work page 2000
[14]

DeVito, M. A., Gergle, D., & Birnholtz, J. (2017). “Algorithms ruin everything”: #RIPTwitter, Folk Theories, and Resistance to Algorithmic Change in Social Media. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 3163–3174. https://doi.org/10.1145/3025453.3025659

work page arXiv 2017
[15]

Dillard, J. P., & Shen, L. (2005). On the nature of reactance and its role in persuasive health communication. Communication Monographs, 72(2), 144–168

work page 2005
[16]

Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots. Commun. ACM, 59(7), 96–104. https://doi.org/10.1145/2818717

work page doi:10.1145/2818717 2016
[17]

Gerrard, Y. (2018). Beyond Hashtags: Coded Discourse in the Pro–Eating Disorder Community on Instagram. New Media & Society, 20(12), 4653–4670

work page 2018
[18]

Gillespie, T. (2018). Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media. Yale University Press

work page 2018
[19]

Gillespie, T. (2019, December). Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media. Yale University Press. https://doi.org/10.12987/9780300235029

work page 2019
[20]

Gillespie, T. (2022). Do Not Recommend? Reduction as a Form of Content Moderation. Social Media + Society, 8(3), 20563051221117552. https://doi.org/10.1177/20563051221117552

Gonçalves, J., Weber, I., Masullo, G. M., Silva, M. T. d., & Hofhuis, J. (2023). Common sense or censorship: How algorithmic mo...

work page doi:10.1177/20563051221117552 2022
[21]

Grimmelmann, J. (2015). The virtues of moderation. Yale Journal of Law & Technology, 17, 42–109.

work page 2015
[22]

Horta Ribeiro, M., Jhaver, S., Zannettou, S., Blackburn, J., Stringhini, G., De Cristofaro, E., & West, R. (2021). Do Platform Migrations Compromise Content Moderation? Evidence from r/The_Donald and r/Incels. Proc. ACM Hum.-Comput. Interact., 5(CSCW2). https://doi.org/10.1145/3476057

work page doi:10.1145/3476057 2021
[23]

Jenkins, H. (2006). Convergence Culture. NYU Press. Retrieved April 10, 2026, from http://www.jstor.org/stable/j.ctt9qffwr

work page 2006
[24]

Jhaver, S., Birman, I., Gilbert, E., & Bruckman, A. (2019). Did You Suspect the Post Would Be Removed? Understanding User Reactions to Content Moderation on Reddit. Proceedings of the ACM on Human-Computer Interaction (CSCW), 3(CSCW), 1–33

work page 2019
[25]

Jhaver, S., Birman, I., Gilbert, E., & Bruckman, A. (2021). Measuring the Effectiveness of Content Moderation Efforts on YouTube. Proceedings of the ACM on Human-Computer Interaction (CSCW), 5(CSCW2), 1–27

work page 2021
[26]

Jhaver, S., Bruckman, A., & Gilbert, E. (2019). Does Transparency in Moderation Affect User Behavior? Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI), 1–14. https://doi.org/10.1145/3290605.3300479

work page doi:10.1145/3290605.3300479 2019
[27]

Jhaver, S., Rathi, H., & Saha, K. (2024). Bystanders of Online Moderation: Examining the Effects of Witnessing Post-Removal Explanations. Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), 1–9. https://doi.org/10.1145/3613904.3642204

work page doi:10.1145/3613904.3642204 2024
[28]

Jiang, J., Middler, S., Brubaker, J. R., & Fiesler, C. (2020). Characterizing Community Guidelines on Social Media Platforms. Companion Publication of the 2020 Conference on Computer Supported Cooperative Work and Social Computing, 287–291. https://doi.org/10.1145/3406865.3418312

work page doi:10.1145/3406865 2020
[29]

Molina, M. D., & Sundar, S. S. (2022). When AI moderates online content: Effects of human collaboration and interactive transparency on user trust. Journal of Computer-Mediated Communication, 27(4), zmac010. https://doi.org/10.1093/jcmc/zmac010

Myers West, S. (2018). C...

work page doi:10.1093/jcmc/zmac010 2022
[30]

Nass, C., Steuer, J., & Tauber, E. R. (1994). Computers are social actors. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 72–78. https://doi.org/10.1145/191666.191703

work page doi:10.1145/191666 1994
[31]

Penney, J. W. (2017). Chilling effects: Online surveillance and Wikipedia use. Berkeley Technology Law Journal, 31(1), 117–182

work page 2017
[32]

Petty, R. E., & Cacioppo, J. T. (1986). Communication and Persuasion: Central and Peripheral Routes to Attitude Change. Springer

work page 1986
[33]

Puschmann, C. (2021). Coded Speech and Platform Governance. Internet, Policy & Politics Conference

work page 2021
[34]

Rai, T. S., & Fiske, A. P. (2011). Moral Psychology Is Relationship Regulation: Moral Motives for Unity, Hierarchy, Equality, and Proportionality. Psychological Review, 118(1), 57–75. https://doi.org/10.1037/a0021867

work page doi:10.1037/a0021867 2011
[36]

Roberts, M. E. (2018). Censored: Distraction and Diversion Inside China’s Great Firewall. Princeton University Press

work page 2018
[37]

Saleem, H. M., & Ruths, D. (2018). The Aftermath of Reddit Bans on Hate Communities. Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), 12(1), 313–322

work page 2018
[38]

Schauer, F. (1978). Fear, risk and the first amendment: Unraveling the “chilling effect”. Boston University Law Review, 58, 685–732

work page 1978
[39]

Srinivasan, K. B., Danescu-Niculescu-Mizil, C., Lee, L., & Tan, C. (2019). Content Removal as a Moderation Strategy: Compliance and Other Outcomes in the ChangeMyView Community. Proc. ACM Hum.-Comput. Interact., 3(CSCW). https://doi.org/10.1145/3359265

work page doi:10.1145/3359265 2019
[40]

Sundar, S. S. (2020). Rise of machine agency: A framework for studying the psychology of human-AI interaction (HAII). Journal of Computer-Mediated Communication, 25(1), 74–88

work page 2020
[42]

Sons, Ltd. https://doi.org/10.1002/9781118426456.ch3

work page doi:10.1002/9781118426456.ch3
[43]

Tyler, T. R. (1990). Why People Obey the Law. Yale University Press

work page 1990
[44]

Walther, J. B. (1996). Computer-Mediated Communication: Impersonal, Interpersonal, and Hyperpersonal Interaction. Communication Research, 23(1), 3–43. https://doi.org/10.1177/009365096023001001

Appendix Data Overview Table 2: Summary Statistics of Moderator Roles and Activity Metric Bot Modteam Pers...

work page doi:10.1177/009365096023001001 1996

[1] [1]

K., & boyd danah, d

Baym, N. K., & boyd danah, d. (2012). Socially Mediated Publicness: An Introduction [ eprint: https://doi.org/10.1080/08838151.2012.705200].Journal of Broadcasting & Electronic Media, 56(3), 320–329. https://doi.org/10.1080/08838151.2012.705200

work page doi:10.1080/08838151.2012.705200 2012

[2] [2]

Binns, R., Van Kleek, M., Veale, M., Lyngs, U., Zhao, J., & Shadbolt, N. (2018). ’It’s Reducing a Human Being to a Percentage’: Perceptions of Justice in Algorithmic Decisions.Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/ 10.1145/3173574.3173951

work page doi:10.1145/3173574.3173951 2018

[3] [3]

, year 2007

Braithwaite, J. (2001).Restorative Justice & Responsive Regulation(1st ed.). Oxford University Press. https://doi.org/10.1093/oso/9780195136395.001.0001

work page doi:10.1093/oso/9780195136395.001.0001 2001

[4] [4]

Brehm, J. W. (1966).A Theory of Psychological Reactance. Academic Press

work page 1966

[5] [5]

Brown, P., & Levinson, S. C. (1987).Politeness: Some Universals in Language Usage. Cambridge University Press

work page 1987

[6] [6]

(2018).Content or Context Moderation? Artisanal, Community, and Industrial Approaches (tech

Caplan, R. (2018).Content or Context Moderation? Artisanal, Community, and Industrial Approaches (tech. rep.). Data & Society Research Institute. New York. https://datasociety.net/library/ content-or-context-moderation/

work page 2018

[7] [7]

Chandrasekharan, E., Pavalanathan, U., Srinivasan, A., Glynn, A., Eisenstein, J., & Gilbert, E. (2017). You Can’t Stay Here: The Efficacy of Reddit’s 2015 Ban Examined Through Hate Speech. Proceedings of the ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW), 1–18. https://doi.org/10.1145/2998181.2998215

work page doi:10.1145/2998181.2998215 2017

[8] [8]

Chandrasekharan, E., Samory, M., Jhaver, S., Charvat, H., Bruckman, A., Lampe, C., Eisenstein, J., & Gilbert, E. (2018). The Internet’s Hidden Rules: An Empirical Study of Reddit Norm Violations at Micro, Meso, and Macro Scales.Proc. ACM Hum.-Comput. Interact.,2(CSCW). https://doi.org/10.1145/3274301

work page doi:10.1145/3274301 2018

[9] [9]

Chandrasekharan, E., Samory, M., Srinivasan, A., & Gilbert, E. (2022). Quarantined! Examining the Effects of Reddit Quarantines on Online Hate and Behavior.Proceedings of the International AAAI Conference on Web and Social Media (ICWSM),16(1), 109–120

work page 2022

[10] [10]

Chang, J., Zhang, H., & Danescu-Niculescu-Mizil, C. (2022). Echoes of Moderation: How Banning Affects the Spread of Toxic Content Online.Proceedings of the International AAAI Conference on Web and Social Media (ICWSM),16(1), 76–87

work page 2022

[11] [11]

S., Hancock, J

Christin, A., Bernstein, M. S., Hancock, J. T., Jia, C., Mado, M. N., Tsai, J. L., & Xu, C. (2024). Inter- nal Fractures: The Competing Logics of Social Media Platforms [eprint: https://doi.org/10.1177/20563051241274668]. Social Media + Society,10(3), 20563051241274668. https://doi.org/10.1177/20563051241274668

work page doi:10.1177/20563051241274668 2024

[12] [12]

\ Goldstein, N J

Cialdini, R. B., & Goldstein, N. J. (2004). Social Influence: Compliance and Conformity.Annual review of psychology,55(1), 591–621. https://doi.org/10.1146/annurev.psych.55.090902.142015

work page doi:10.1146/annurev.psych.55.090902.142015 2004

[13] [13]

L., & Ryan, R

Deci, E. L., & Ryan, R. M. (2000). The ”what” and ”why” of goal pursuits: Human needs and the self-determination of behavior.Psychological Inquiry,11(4), 227–268

work page 2000

[14] [14]

A., Gergle, D., & Birnholtz, J

DeVito, M. A., Gergle, D., & Birnholtz, J. (2017). ”Algorithms ruin everything”: #RIPTwitter, Folk Theories, and Resistance to Algorithmic Change in Social Media.Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 3163–3174. https://doi.org/10.1145/ 3025453.3025659

work page arXiv 2017

[15] [15]

P., & Shen, L

Dillard, J. P., & Shen, L. (2005). On the nature of reactance and its role in persuasive health commu- nication.Communication Monographs,72(2), 144–168

work page 2005

[16] [16]

Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots.Commun. ACM,59(7), 96–104. https://doi.org/10.1145/2818717

work page doi:10.1145/2818717 2016

[17] [17]

Gerrard, Y. (2018). Beyond Hashtags: Coded Discourse in the Pro–Eating Disorder Community on Instagram.New Media & Society,20(12), 4653–4670

Gillespie, T. (2018). Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media. Yale University Press.

Gillespie, T. (2019, December). Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media. Yale University Press. https://doi.org/10.12987/9780300235029

Gillespie, T. (2022). Do Not Recommend? Reduction as a Form of Content Moderation. Social Media + Society, 8(3), 20563051221117552. https://doi.org/10.1177/20563051221117552

Gonçalves, J., Weber, I., Masullo, G. M., Silva, M. T. d., & Hofhuis, J. (2023). Common sense or censorship: How algorithmic mo...

Grimmelmann, J. (2015). The virtues of moderation. Yale Journal of Law & Technology, 17, 42–109.

Horta Ribeiro, M., Jhaver, S., Zannettou, S., Blackburn, J., Stringhini, G., De Cristofaro, E., & West, R. (2021). Do Platform Migrations Compromise Content Moderation? Evidence from r/The_Donald and r/Incels. Proc. ACM Hum.-Comput. Interact., 5(CSCW2). https://doi.org/10.1145/3476057

Jenkins, H. (2006). Convergence Culture. NYU Press. Retrieved April 10, 2026, from http://www.jstor.org/stable/j.ctt9qffwr

Jhaver, S., Birman, I., Gilbert, E., & Bruckman, A. (2019). Did You Suspect the Post Would Be Removed? Understanding User Reactions to Content Moderation on Reddit. Proceedings of the ACM on Human-Computer Interaction (CSCW), 3(CSCW), 1–33.

Jhaver, S., Birman, I., Gilbert, E., & Bruckman, A. (2021). Measuring the Effectiveness of Content Moderation Efforts on YouTube. Proceedings of the ACM on Human-Computer Interaction (CSCW), 5(CSCW2), 1–27.

Jhaver, S., Bruckman, A., & Gilbert, E. (2019). Does Transparency in Moderation Affect User Behavior? Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI), 1–14. https://doi.org/10.1145/3290605.3300479

Jhaver, S., Rathi, H., & Saha, K. (2024). Bystanders of Online Moderation: Examining the Effects of Witnessing Post-Removal Explanations. Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), 1–9. https://doi.org/10.1145/3613904.3642204

Jiang, J., Middler, S., Brubaker, J. R., & Fiesler, C. (2020). Characterizing Community Guidelines on Social Media Platforms. Companion Publication of the 2020 Conference on Computer Supported Cooperative Work and Social Computing, 287–291. https://doi.org/10.1145/3406865.3418312

Molina, M. D., & Sundar, S. S. (2022). When AI moderates online content: Effects of human collaboration and interactive transparency on user trust. Journal of Computer-Mediated Communication, 27(4), zmac010. https://doi.org/10.1093/jcmc/zmac010

Myers West, S. (2018). C...

Nass, C., Steuer, J., & Tauber, E. R. (1994). Computers are social actors. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 72–78. https://doi.org/10.1145/191666.191703

Penney, J. W. (2017). Chilling effects: Online surveillance and Wikipedia use. Berkeley Technology Law Journal, 31(1), 117–182.

Petty, R. E., & Cacioppo, J. T. (1986). Communication and Persuasion: Central and Peripheral Routes to Attitude Change. Springer.

Puschmann, C. (2021). Coded Speech and Platform Governance. Internet, Policy & Politics Conference.

Rai, T. S., & Fiske, A. P. (2011). Moral Psychology Is Relationship Regulation: Moral Motives for Unity, Hierarchy, Equality, and Proportionality. Psychological Review, 118(1), 57–75. https://doi.org/10.1037/a0021867

Roberts, M. E. (2018). Censored: Distraction and Diversion Inside China’s Great Firewall. Princeton University Press.

Saleem, H. M., & Ruths, D. (2018). The Aftermath of Reddit Bans on Hate Communities. Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), 12(1), 313–322.

Schauer, F. (1978). Fear, risk and the first amendment: Unraveling the “chilling effect”. Boston University Law Review, 58, 685–732.

Srinivasan, K. B., Danescu-Niculescu-Mizil, C., Lee, L., & Tan, C. (2019). Content Removal as a Moderation Strategy: Compliance and Other Outcomes in the ChangeMyView Community. Proc. ACM Hum.-Comput. Interact., 3(CSCW). https://doi.org/10.1145/3359265

Sundar, S. S. (2020). Rise of machine agency: A framework for studying the psychology of human-AI interaction (HAII). Journal of Computer-Mediated Communication, 25(1), 74–88.

Sons, Ltd. https://doi.org/10.1002/9781118426456.ch3

Tyler, T. R. (1990). Why People Obey the Law. Yale University Press.

Walther, J. B. (1996). Computer-Mediated Communication: Impersonal, Interpersonal, and Hyperpersonal Interaction. Communication Research, 23(1), 3–43. https://doi.org/10.1177/009365096023001001

Appendix

Data Overview

Table 2: Summary Statistics of Moderator Roles and Activity

Metric Bot Modteam Pers...
