Assessing How Hate, Counterspeech, and Toxicity Affect Hate Group Newcomers

Daniel Hickey; Daniel M.T. Fessler; Goran Murić; Keith Burghardt; Kristina Lerman; Matheus Schmitz; Paul E. Smaldino

arxiv: 2405.18374 · v2 · submitted 2024-05-28 · 💻 cs.CY · cs.HC

Daniel Hickey, Matheus Schmitz, Daniel M.T. Fessler, Paul E. Smaldino, Kristina Lerman, Goran Murić, Keith Burghardt

Pith reviewed 2026-05-24 01:12 UTC · model grok-4.3

keywords: counterspeech, hate speech, toxicity, Reddit, newcomers, online communities, retention, social media

The pith

Newcomers who post hate speech and receive counterspeech are less likely to continue posting in hate subreddits.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies how counterspeech influences whether newcomers keep participating in online hate communities after they post hate speech. Using data from over 16,000 newcomers in 104 Reddit hate subreddits, it shows that counterspeech reduces the chance they will post again, instead of making them more committed. It also measures toxicity in counterspeech and finds that while counterspeech is more toxic than normal discussion, its toxicity level does not change the retention outcome but does increase the odds of further hostile replies from the newcomer. These patterns matter for efforts to limit hate speech because they indicate that responses can discourage new participants rather than entrench them.

Core claim

Newcomers using hate speech who receive counterspeech are less likely to continue posting within these hate subreddits, rather than becoming galvanized. Counterspeech comments are less toxic than hate speech comments but almost twice as toxic as other discourse. No association exists between the toxicity of counterspeech and its effects on user retention, yet toxic counterspeech increases the probability of continued hostility from hate users within the same discussion.

What carries the argument

LLM-based counterspeech detection applied to observational posting records of 16,513 newcomers across 104 hate subreddits.

If this is right

Counterspeech reduces the likelihood that newcomers continue posting in hate subreddits.
Toxicity of counterspeech has no measurable effect on whether users stay or leave the community.
Toxic counterspeech raises the probability of continued hostile replies from the original poster in the same thread.
Many newcomers may be testing proscribed beliefs rather than acting as committed adherents.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Platforms might reduce newcomer involvement by promoting or surfacing counterspeech early in threads.
The retention drop could reflect boundary-testing behavior that fades once the belief meets opposition.
Effects observed on Reddit may differ on platforms with different moderation norms or reply visibility.
Longer-term tracking could reveal whether the initial drop in posting leads to permanent disengagement or migration elsewhere.

Load-bearing premise

The LLM-based counterspeech detection accurately identifies true counterspeech without significant false positives or negatives, and the observational data allows inferring the effect of counterspeech on retention without major confounding.
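
One way to probe the detection half of this premise is to compare the detector's labels with human labels on a sample of replies drawn from the target corpus. The sketch below is a minimal illustration under stated assumptions: the function name `detection_metrics` and the eight toy labels are hypothetical and do not come from the paper.

```python
from typing import Dict, Sequence


def detection_metrics(human: Sequence[bool], model: Sequence[bool]) -> Dict[str, float]:
    """Precision, recall, and F1 of model labels measured against human labels."""
    if len(human) != len(model):
        raise ValueError("label lists must have equal length")
    pairs = list(zip(human, model))
    tp = sum(1 for h, m in pairs if h and m)        # counterspeech found by both
    fp = sum(1 for h, m in pairs if not h and m)    # model says yes, human says no
    fn = sum(1 for h, m in pairs if h and not m)    # model misses true counterspeech
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for eight replies (True = counterspeech); illustration only.
human_labels = [True, True, True, False, False, False, True, False]
model_labels = [True, True, False, True, False, False, True, False]

metrics = detection_metrics(human_labels, model_labels)
print(metrics)
```

Low precision would mean that some "counterspeech recipients" in the retention comparison never received counterspeech, and low recall would mean that some members of the comparison group did. Either error blurs the contrast between the two groups.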

What would settle it

A controlled experiment that randomly assigns counterspeech replies to some hate-posting newcomers and measures their subsequent posting rates in the same subreddits.
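
If such an experiment were run, its outcome could be analyzed with a simple permutation test on the difference in return rates between the two arms. The following is a sketch only: the outcome vectors are invented, and the names `return_rate_difference` and `permutation_p_value` are hypothetical helpers, not anything described in the paper.

```python
import random
from typing import List, Sequence, Tuple


def return_rate_difference(treated: Sequence[int], control: Sequence[int]) -> float:
    """Share of treated users who posted again minus the share of control users."""
    return sum(treated) / len(treated) - sum(control) / len(control)


def permutation_p_value(
    treated: Sequence[int],
    control: Sequence[int],
    n_permutations: int = 10_000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Two-sided permutation test: reshuffle arm labels and recompute the difference."""
    rng = random.Random(seed)
    observed = return_rate_difference(treated, control)
    pooled: List[int] = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = return_rate_difference(pooled[:n_treated], pooled[n_treated:])
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)


# Invented outcomes: 1 = user posted again in the subreddit, 0 = did not.
counterspeech_arm = [1] * 30 + [0] * 70   # 30% return
no_reply_arm = [1] * 50 + [0] * 50        # 50% return

observed, p_value = permutation_p_value(counterspeech_arm, no_reply_arm)
print(f"difference in return rate: {observed:.2f}, p = {p_value:.4f}")
```

Because assignment is random in this design, a small p-value would support a causal reading that the observational comparison cannot supply on its own.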

Figures

Figures reproduced from arXiv: 2405.18374 by Daniel Hickey, Daniel M.T. Fessler, Goran Murić, Keith Burghardt, Kristina Lerman, Matheus Schmitz, Paul E. Smaldino.

**Figure 2.** Performance of counterspeech detection models

**Figure 3.** Mean toxicity by type of reply. Black vertical lines...

**Figure 5.** Probability of receiving a toxic follow-up reply

Original abstract

Counterspeech has gained attention as a strategy to reduce hate speech on social media. Although previous studies suggest that counterspeech can reduce hate speech, little is known about its effects on participation in online hate communities. Relatedly, we lack an understanding about the degree of hostility in counterspeech. Hostile counterspeech may increase online conflict, potentially hardening the positions of hate adherents, and further eroding online environments. Here, we analyzed the effect of counterspeech on 16,513 newcomers across 104 hate subreddits (forums within Reddit.com). We devised an LLM-based counterspeech detection approach that outperforms specialized models trained on existing datasets, then examined the presence, and effects of, hostility. While counterspeech comments are less toxic than hate speech comments, they are almost twice as toxic as other discourse within hate subreddits. We then evaluated the effect of counterspeech on newcomer engagement in hate subreddits. We found that newcomers using hate speech who receive counterspeech are less likely to continue posting within these hate subreddits, rather than becoming galvanized. We speculate that, instead of constituting ardent hate adherents, readily-dissuaded newcomers may merely be toying with beliefs that are proscribed in other contexts. Although we found no association between the toxicity of counterspeech and its effects on user retention, consistent with prior research regarding the harmful effects of toxic speech, we found that toxic counterspeech increases the probability of continued hostility from hate users within the same discussion.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows counterspeech correlates with lower retention among hate-posting newcomers on Reddit, but the observational setup leaves selection effects as a live alternative.

The core result is that newcomers who post hate speech and receive counterspeech are less likely to keep posting in those subreddits. The authors also report that counterspeech sits between hate speech and ordinary comments on toxicity, and that toxic counterspeech is linked to more continued hostility in the thread but not to retention differences. They built an LLM detector that beats prior specialized models on benchmark datasets and applied it to 16,513 newcomers across 104 hate subreddits. That scale, and the shift in focus from speech volume to participation, are the clearest additions to the literature on counterspeech. The toxicity comparisons are direct and line up with earlier patterns. The retention finding is new in this setting.

The design is observational, so differences in who receives counterspeech could reflect user traits, post visibility, or subreddit norms that also predict disengagement. No fixed effects, matching, or other steps to address that are described. The LLM is validated on existing datasets rather than on labels from this corpus of newcomer posts, which leaves error rates on the target material unclear. The abstract does not mention robustness checks or uncertainty measures around the main estimates.

Readers working on platform moderation or online extremism would find the participation angle useful even with those limits. The data effort is large enough that a referee could usefully press on identification and validation without starting from scratch. I would send it for review rather than desk reject.
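
The selection concern in this letter can be made concrete with a toy stratified comparison, in which retention is compared only within each subreddit and the within-subreddit differences are then averaged. This is a minimal sketch with invented counts. The helper names are hypothetical, and stratifying on subreddit adjusts only for subreddit-level differences, not for unobserved user traits or post visibility.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Record = Tuple[str, bool, bool]  # (subreddit, got_counterspeech, posted_again)


def stratified_retention_difference(records: Iterable[Record]) -> float:
    """Size-weighted mean of within-subreddit retention differences."""
    # counts[subreddit][got_counterspeech] = [n_users, n_returned]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for subreddit, got_cs, returned in records:
        cell = counts[subreddit][bool(got_cs)]
        cell[0] += 1
        cell[1] += int(returned)
    weighted_sum, total_weight = 0.0, 0
    for groups in counts.values():
        (n_cs, r_cs), (n_no, r_no) = groups[True], groups[False]
        if n_cs == 0 or n_no == 0:
            continue  # no within-subreddit contrast available in this stratum
        weight = n_cs + n_no
        weighted_sum += weight * (r_cs / n_cs - r_no / n_no)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no stratum contains both groups")
    return weighted_sum / total_weight


def make_records(subreddit: str, got_cs: bool, n: int, n_returned: int) -> List[Record]:
    """Build n toy records, the first n_returned of which posted again."""
    return [(subreddit, got_cs, i < n_returned) for i in range(n)]


# Invented counts for two hypothetical subreddits "a" and "b".
records = (
    make_records("a", True, 10, 2) + make_records("a", False, 10, 4)
    + make_records("b", True, 5, 3) + make_records("b", False, 15, 12)
)

pooled_cs = [r for _, cs, r in records if cs]
pooled_no = [r for _, cs, r in records if not cs]
pooled_diff = sum(pooled_cs) / len(pooled_cs) - sum(pooled_no) / len(pooled_no)
stratified_diff = stratified_retention_difference(records)
print(f"pooled: {pooled_diff:.3f}, stratified: {stratified_diff:.3f}")
```

In this toy data the pooled gap is larger than the within-subreddit gap because counterspeech is more common in the subreddit where retention is lower overall, which is the kind of composition effect the letter is worried about.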

Referee Report

3 major / 1 minor

Summary. The paper analyzes counterspeech effects on 16,513 newcomers across 104 hate subreddits using Reddit data. It introduces an LLM-based counterspeech detector claimed to outperform specialized models, reports that counterspeech is less toxic than hate speech but nearly twice as toxic as other subreddit discourse, and finds that hate-speech-posting newcomers receiving counterspeech are less likely to continue posting (rather than becoming galvanized). It further notes no association between counterspeech toxicity and retention but an increase in continued hostility from toxic counterspeech within the same thread.

Significance. If the central observational association can be shown to support causal inference, the result would contribute to computational social science by suggesting counterspeech deters rather than entrenches participation in hate communities, with implications for moderation design. The scale of the newcomer cohort and the toxicity comparisons add empirical value, though the lack of reported identification strategies limits immediate policy weight.

major comments (3)

[Abstract and Results] Abstract and Results sections: the claim that counterspeech recipients show lower retention is presented as an effect on engagement, yet the analysis is purely observational with no reported fixed effects, propensity matching, or other identification strategy to address selection on unobservables (e.g., pre-existing user commitment, post visibility, or subreddit norms that jointly predict both counterspeech receipt and disengagement).
[Methods] Methods section: the LLM counterspeech detector is validated only against existing external datasets rather than against human labels collected on the specific corpus of newcomer hate-speech posts; this gap directly affects the reliability of the downstream retention comparison.
[Results] Results section: no error bars, confidence intervals, or robustness checks (e.g., alternative classifiers, subsample analyses) are mentioned for the reported retention differences, undermining assessment of whether the observed association is statistically distinguishable from noise.
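
The uncertainty measure requested in the third major comment could take the form of a percentile bootstrap interval around the difference in retention rates. The sketch below assumes binary outcome vectors with invented values. The function name `bootstrap_ci` and all parameters are hypothetical, and the paper may report uncertainty in a different way.

```python
import random
from typing import Sequence, Tuple


def bootstrap_ci(
    treated: Sequence[int],
    control: Sequence[int],
    n_resamples: int = 5000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the difference in retention rates."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping group sizes fixed.
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        diffs.append(sum(t) / len(t) - sum(c) / len(c))
    diffs.sort()
    lower = diffs[int(alpha / 2 * n_resamples)]
    upper = diffs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Invented outcomes: 1 = newcomer posted again, 0 = did not.
received_counterspeech = [1] * 60 + [0] * 140   # 30% retained
no_counterspeech = [1] * 100 + [0] * 100        # 50% retained

ci_low, ci_high = bootstrap_ci(received_counterspeech, no_counterspeech)
print(f"95% bootstrap CI for the retention difference: [{ci_low:.3f}, {ci_high:.3f}]")
```

An interval that excludes zero indicates that the association is distinguishable from sampling noise, although it says nothing about confounding.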

minor comments (1)

[Abstract] The abstract could more explicitly distinguish the reported association from a causal claim to avoid over-interpretation by readers.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We address each major comment below, clarifying the observational nature of the study and outlining planned revisions to improve clarity, transparency, and rigor.

Referee: [Abstract and Results] Abstract and Results sections: the claim that counterspeech recipients show lower retention is presented as an effect on engagement, yet the analysis is purely observational with no reported fixed effects, propensity matching, or other identification strategy to address selection on unobservables (e.g., pre-existing user commitment, post visibility, or subreddit norms that jointly predict both counterspeech receipt and disengagement).

Authors: We agree that the analysis is observational and does not include causal identification strategies. The manuscript employs the term 'effect' in a descriptive sense, but to prevent any implication of causality we will revise the abstract and results sections to use 'association' and 'relationship' throughout. We will also add an explicit statement in the methods and discussion sections noting the observational design and the absence of controls for selection on unobservables.

revision: yes

Referee: [Methods] Methods section: the LLM counterspeech detector is validated only against existing external datasets rather than against human labels collected on the specific corpus of newcomer hate-speech posts; this gap directly affects the reliability of the downstream retention comparison.

Authors: Validation was conducted on established external datasets to benchmark against prior work. We acknowledge that human annotation on the specific newcomer corpus would provide stronger domain-specific evidence. Because new annotation collection lies outside the scope of the current study, we will add a dedicated limitations paragraph discussing this gap and its potential implications for the retention analyses.

revision: partial

Referee: [Results] Results section: no error bars, confidence intervals, or robustness checks (e.g., alternative classifiers, subsample analyses) are mentioned for the reported retention differences, undermining assessment of whether the observed association is statistically distinguishable from noise.

Authors: We will incorporate confidence intervals or error bars for all reported retention differences. In addition, we will perform and report robustness checks using alternative classifiers and relevant subsample analyses in the revised results section.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical observational analysis with external data and validation

The paper conducts an observational study of Reddit data covering 16,513 newcomers across 104 subreddits, using an LLM-based detector validated against existing datasets and examining associations with retention and toxicity. No equations, derivations, or first-principles claims are presented that reduce to fitted inputs or self-definitions by construction. No load-bearing self-citations or uniqueness theorems from prior author work are invoked to force results. The central findings (lower retention after counterspeech, toxicity comparisons) are statistical associations drawn from an external corpus, not renamings or predictions equivalent to inputs. This is a standard empirical paper, self-contained and evaluated against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The study relies on assumptions about data labeling and detection accuracy that are typical in social media research; no free parameters or new entities are introduced.

axioms (2)

domain assumption: Reddit subreddits labeled as hate communities accurately represent online hate groups.
The study selects 104 hate subreddits without detailing how they were identified or validated.
domain assumption: An LLM can reliably detect counterspeech in the context of hate speech.
The paper devises an LLM-based approach, but its details are not given in the abstract.

