
arxiv: 2506.05086 · v2 · submitted 2025-06-05 · 💻 cs.SI · cs.CY

Among Us: Language of Conspiracy Theorists on Mainstream Reddit

Pith reviewed 2026-05-19 11:18 UTC · model grok-4.3

keywords conspiracy theories · reddit · linguistic patterns · machine learning · social media · fringe communities · online discourse · classification tasks

The pith

Users active in conspiracy subreddits display distinctive language patterns even in mainstream Reddit communities, and per-community machine learning models can identify these users at 87 percent average accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether people who post in conspiracy-focused Reddit groups write differently when they participate in ordinary forums such as news, humor, or hobby discussions. It draws on a dataset of over 500 million comments across ten years to compare the language of these users against others within the same communities. Machine learning classifiers trained on linguistic features succeed in separating the two groups with high reliability, yet the best performance comes only when each community gets its own model rather than a single model applied everywhere. The result indicates that the linguistic differences are real but shift according to local social expectations. This points toward the need for detection and moderation tools that adapt to each specific online space instead of relying on one-size-fits-all approaches.

Core claim

Users who participate in conspiracy-focused subreddits exhibit distinctive linguistic patterns when they post in general-interest Reddit communities. These patterns allow binary machine learning classifiers to distinguish them from other users within individual communities at an average accuracy of 87 percent across more than twenty tasks. Community-specific models outperform any single global classifier by as much as 17 percentage points. The authors interpret this gap as evidence that linguistic expression among these users remains dynamic and responsive to the norms of each environment rather than fixed across all contexts.

What carries the argument

Per-community binary classifiers trained on linguistic features from user comments to separate conspiracy-active users from the rest of a given subreddit.
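
The contrast between per-community and global models can be illustrated with a toy experiment. The sketch below is not the paper's pipeline: it uses synthetic one-dimensional "style scores", a nearest-centroid rule in place of the authors' classifiers, and two hypothetical communities (`news`, `humor`) in which the marker is assumed to point in opposite directions. It only shows why pooling data can wash out a signal that is clear locally.

```python
import random

def make_community(n, conspiracy_mean, seed):
    """Synthetic 1-D style score per user: label 1 = conspiracy-active, 0 = other."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        data.append((rng.gauss(conspiracy_mean, 1.0), 1))
        data.append((rng.gauss(-conspiracy_mean, 1.0), 0))
    rng.shuffle(data)
    return data

def split(data, train_fraction=0.7):
    """Split an already shuffled list into train and test parts."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

def fit_centroids(train):
    """Nearest-centroid model: one mean score per class."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        means[label] = sum(values) / len(values)
    return means

def accuracy(centroids, test):
    """Fraction of test rows whose nearest centroid matches the true label."""
    hits = 0
    for x, y in test:
        predicted = min(centroids, key=lambda label: abs(x - centroids[label]))
        hits += predicted == y
    return hits / len(test)

# Assumption: the marker flips direction between the two hypothetical communities.
communities = {
    "news": make_community(2000, conspiracy_mean=1.0, seed=1),
    "humor": make_community(2000, conspiracy_mean=-1.0, seed=2),
}
splits = {name: split(data) for name, data in communities.items()}

# One model per community, evaluated on that community's held-out rows.
local_acc = {
    name: accuracy(fit_centroids(train), test)
    for name, (train, test) in splits.items()
}

# One global model trained on pooled data, evaluated per community.
pooled_train = [row for train, _ in splits.values() for row in train]
global_model = fit_centroids(pooled_train)
global_acc = {
    name: accuracy(global_model, test)
    for name, (_, test) in splits.items()
}

mean_local = sum(local_acc.values()) / len(local_acc)
mean_global = sum(global_acc.values()) / len(global_acc)
print(f"per-community: {mean_local:.2f}  global: {mean_global:.2f}")
```

The flipped marker is a deliberately extreme case; the paper reports a gap of up to 17 points, not a collapse to chance.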

If this is right

  • Uniform moderation strategies across Reddit will miss many signals because linguistic markers vary by community.
  • Detection systems must be retrained or adapted separately for each subreddit to maintain high accuracy.
  • Linguistic differences among these users persist across news, humor, and hobby contexts rather than appearing only inside conspiracy spaces.
  • Global models trained on pooled data lose up to 17 points of accuracy compared with localized ones.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same approach could be applied to users from other fringe or activist groups to test whether distinctive language appears outside their dedicated forums.
  • Longitudinal tracking of individual users might reveal whether these linguistic markers appear before or after joining conspiracy communities.
  • Platform interventions could prioritize community-specific training data collection to improve early identification without over-flagging general discussion.

Load-bearing premise

Participation in conspiracy subreddits marks a distinct user population whose language differences in other communities arise independently of the specific topics under discussion.

What would settle it

A test that trains classifiers on linguistic features while matching or controlling for the exact topics discussed in comments and finds that accuracy falls close to random guessing would falsify the claim of topic-independent signatures.
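
One way such a test might be set up is to balance the two groups by topic before any classifier is trained, so that topic alone carries no information about group membership. The sketch below assumes each comment already has a topic label (for example from a topic model); the dictionary keys `topic`, `group`, and `text` and the function name `match_by_topic` are hypothetical, and the paper does not describe this procedure.

```python
import random
from collections import defaultdict

def match_by_topic(comments, seed=0):
    """Downsample so each topic has equally many comments from both groups.

    `comments` is a list of dicts with hypothetical keys:
    'topic' (any hashable label), 'group' ('conspiracy' or 'other'), 'text'.
    Topics that appear in only one group are dropped entirely.
    """
    rng = random.Random(seed)
    buckets = defaultdict(lambda: {"conspiracy": [], "other": []})
    for comment in comments:
        buckets[comment["topic"]][comment["group"]].append(comment)

    matched = []
    for groups in buckets.values():
        k = min(len(groups["conspiracy"]), len(groups["other"]))
        for group_comments in groups.values():
            matched.extend(rng.sample(group_comments, k))
    return matched

# Toy input: 'sports' is covered by both groups, 'politics' by only one.
comments = (
    [{"topic": "sports", "group": "conspiracy", "text": f"c{i}"} for i in range(3)]
    + [{"topic": "sports", "group": "other", "text": f"o{i}"} for i in range(5)]
    + [{"topic": "politics", "group": "conspiracy", "text": f"p{i}"} for i in range(4)]
)
matched = match_by_topic(comments)
```

If classifiers trained on the matched set fall close to 50 percent accuracy, the topic-independence claim is in trouble; if they stay well above it, the claim survives this particular check.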

Original abstract

The interaction between fringe subcultures and mainstream online communities poses significant challenges for understanding discourse on social media. In this work, we investigate whether users active in conspiracy-focused communities exhibit detectable linguistic signatures when participating in general-interest spaces, such as news, humor, or hobbyist forums. We analyze a large-scale longitudinal dataset of over 500 million comments spanning 10 years of Reddit activity, examining the communication patterns of these users across diverse social contexts independent of the topics they discuss. We show that these users exhibit distinctive linguistic patterns that enable machine learning models to reliably distinguish them from the general population within individual communities (averaging 87% accuracy across more than 20 binary classification tasks). Crucially, no single aggregate model captures these patterns across communities, as community-specific models outperform global classifiers by up to 17 percentage points. This result suggests that while these users are distinct, their linguistic expression is dynamic and highly responsive to the social norms of the environment they inhabit. Our findings suggest the need for tailored interventions in online spaces, as linguistic signals associated with conspiracy and fringe subcultures vary across communities and cannot be effectively addressed by uniform detection or moderation strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that users active in conspiracy-focused subreddits exhibit distinctive linguistic patterns in mainstream Reddit communities (news, humor, hobbyist forums) that enable ML models to distinguish them from the general population with an average 87% accuracy across more than 20 per-community binary classification tasks. It further claims that community-specific models outperform global classifiers by up to 17 points, indicating that linguistic expression is dynamic and responsive to local social norms rather than fixed or topic-driven.

Significance. If the central result holds after proper controls, the work would be significant for computational social science by providing large-scale longitudinal evidence (500M+ comments over 10 years) that fringe-group linguistic markers are context-dependent. This would support the need for community-tailored moderation strategies and demonstrate the value of comparing local versus aggregate classifiers for detecting norm-responsive idiolects.

major comments (2)
  1. [Abstract] Abstract: The claim that detected patterns are 'independent of the topics they discuss' and reflect 'dynamic and highly responsive' linguistic signatures is load-bearing for the central result but lacks described controls. The user-identification method (conspiracy-subreddit participation) followed by classification on mainstream comments does not mention topic matching, keyword filtering, LDA-based content removal, or restriction to function-word features; without these, the 87% accuracy may exploit residual domain vocabulary rather than style.
  2. [Abstract] Abstract and results sections: The reported classification accuracies are presented without details on feature engineering, baseline comparisons (e.g., bag-of-words vs. LIWC vs. embeddings), statistical significance testing, or confounds such as differential post volume, user tenure, or demographic proxies. These omissions make it impossible to evaluate whether the per-community advantage over global models (up to 17 points) is robust or an artifact of unbalanced data.
minor comments (1)
  1. [Abstract] Abstract: The statement 'averaging 87% accuracy across more than 20 binary classification tasks' would be clearer if it reported the range, standard deviation, or per-community breakdown to allow assessment of consistency.
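
The first major comment mentions restricting features to function words as one way to separate style from topic. A minimal sketch of that idea follows; the word list is a short illustrative sample chosen here, not the feature set used in the paper, and `function_word_profile` is a hypothetical helper name.

```python
import re
from collections import Counter

# Illustrative list only; stylometric work typically uses far longer lists.
FUNCTION_WORDS = ("the", "a", "an", "of", "in", "to", "and", "but", "or",
                  "i", "you", "we", "they", "it", "not", "is", "are", "was")

def function_word_profile(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]

profile = function_word_profile(
    "They said it was not the truth, but we know it is."
)
```

Because content words such as "truth" never enter the vector, a classifier trained on these profiles cannot lean on topical vocabulary directly, although function-word use can still correlate with topic.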

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their careful reading and constructive feedback. The comments identify important areas for clarification regarding controls for topic independence and experimental details. We address each point below and commit to revisions that strengthen the manuscript without altering its core claims.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The claim that detected patterns are 'independent of the topics they discuss' and reflect 'dynamic and highly responsive' linguistic signatures is load-bearing for the central result but lacks described controls. The user-identification method (conspiracy-subreddit participation) followed by classification on mainstream comments does not mention topic matching, keyword filtering, LDA-based content removal, or restriction to function-word features; without these, the 87% accuracy may exploit residual domain vocabulary rather than style.

    Authors: We agree that the abstract would benefit from explicit reference to the controls used. The full manuscript selects mainstream comments exclusively from non-conspiracy subreddits (news, humor, and hobbyist forums) and focuses analysis on stylistic patterns. In revision we will expand the abstract and add a methods subsection detailing keyword filtering to exclude conspiracy-related terms, LDA-based verification that topic distributions are matched between groups, and primary reliance on function-word and syntactic features to isolate style from content. These additions will directly support the independence claim. revision: yes

  2. Referee: [Abstract] Abstract and results sections: The reported classification accuracies are presented without details on feature engineering, baseline comparisons (e.g., bag-of-words vs. LIWC vs. embeddings), statistical significance testing, or confounds such as differential post volume, user tenure, or demographic proxies. These omissions make it impossible to evaluate whether the per-community advantage over global models (up to 17 points) is robust or an artifact of unbalanced data.

    Authors: We accept that greater methodological transparency is required. The revised manuscript will include an expanded methods section describing the full feature set, explicit baseline comparisons (bag-of-words, LIWC, and embeddings), and statistical tests (e.g., McNemar’s test) for accuracy differences between community-specific and global models. We will also report controls for post volume and user tenure via matching or covariate adjustment. Demographic proxies cannot be addressed because the Reddit dataset contains no such information; we will note this limitation explicitly. revision: partial
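
The simulated response names McNemar's test for comparing community-specific and global models. As a rough illustration, the exact two-sided version needs only the two discordant counts, taken from both models' predictions on the same test items. The function name and argument names below are chosen for this sketch and do not come from the paper.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    b: items the community-specific model gets right and the global model wrong.
    c: items the global model gets right and the community-specific model wrong.
    Under the null hypothesis each discordant item is equally likely to fall
    in either cell, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Example: the models disagree on 10 items and the local model wins all 10.
p_value = mcnemar_exact(10, 0)
```

Items that both models classify correctly, or both misclassify, do not affect the result, which is why the test suits paired comparisons on a shared test set.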

standing simulated objections not resolved
  • Demographic proxies cannot be controlled because the underlying Reddit data provides no user demographic information.

Circularity Check

0 steps flagged

No circularity: empirical classification results are independent of inputs

Full rationale

The paper conducts a standard empirical study: users are labeled by participation in conspiracy-focused subreddits, features are extracted from their separate mainstream Reddit comments, and binary classifiers are trained and evaluated on held-out data to produce the reported per-community accuracies. No equations, parameters, or predictions are defined in terms of the target result itself, no self-citation chain is load-bearing for the central claim, and the accuracies are direct outputs of the ML pipeline rather than renamings or fitted inputs. The work is therefore self-contained against external benchmarks and exhibits no reduction of results to inputs by construction.
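
The rationale refers to evaluation on held-out data without saying how the split is made. One common precaution, assumed here and not confirmed by the source, is to split by user, so that no author's comments appear on both sides. The sketch below assigns each user deterministically from a hash of a hypothetical user ID.

```python
import hashlib

def assign_split(user_id, test_fraction=0.2):
    """Deterministically assign a user to 'train' or 'test' from a hash of the ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2 ** 64  # value in [0, 1)
    return "test" if bucket < test_fraction else "train"

def split_comments(comments):
    """`comments` is a list of (user_id, text) pairs; the IDs are hypothetical."""
    parts = {"train": [], "test": []}
    for user_id, text in comments:
        parts[assign_split(user_id)].append((user_id, text))
    return parts

# Toy input: 50 made-up users with 3 comments each.
toy = [(f"user{i}", f"comment {j}") for i in range(50) for j in range(3)]
parts = split_comments(toy)
```

Hashing keeps the assignment stable across runs and across subreddits, so a user held out in one community is held out in all of them.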

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the domain assumption that subreddit participation serves as a reliable proxy for conspiracy orientation and that observed linguistic differences can be isolated from topical content; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption Participation in conspiracy-focused subreddits reliably identifies users with a distinct linguistic profile independent of discussion topics
    This labeling choice underpins the binary classification tasks described in the abstract.

pith-pipeline@v0.9.0 · 5740 in / 1306 out tokens · 55601 ms · 2026-05-19T11:18:50.474637+00:00 · methodology



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Effects of Algorithmic Visibility on Conspiracy Communities: Reddit after Epstein's 'Suicide'

    cs.CY 2025-12 unverdicted novelty 5.0

    Mainstream visibility after Epstein's death selected for users who stayed less and talked less like core members, pointing to selection rather than simple amplification in conspiracy community growth.

  2. Simulating Online Social Media Conversations on Controversial Topics Using AI Agents Calibrated on Real-World Data

    cs.SI 2025-09 conditional novelty 5.0

    LLM agents calibrated on Italian election data produce coherent posts and realistic network structure but show less tone and toxicity variation than real users, with opinion changes resembling traditional mathematical models.
