Anti-Latinx Computational Propaganda in the United States
Pith reviewed 2026-05-25 15:36 UTC · model grok-4.3
The pith
Extremist voices on Reddit produce most political content about Latinos before US elections.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Collection and analysis of Reddit posts mentioning Latinos and the US midterm elections from September 2017 to September 2018 reveals a lack of neutral actors; instead, individuals operating in subreddits that self-identify as political trolls generate the majority of the political content, highlighting data voids that precede elections and can be exploited by mis- and disinformation.
What carries the argument
Temporal and user-level analysis of Reddit posts and subreddit affiliations to trace which accounts create the largest volume and most popular content about Latinos.
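The aggregation described above can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the record fields (`author`, `subreddit`, `date`, `score`) and the sample rows are hypothetical, because the paper's data schema is not given here.

```python
from collections import Counter
from datetime import date

# Hypothetical post records; field names and values are assumptions.
posts = [
    {"author": "user_a", "subreddit": "sub_x", "date": "2017-09-25", "score": 12},
    {"author": "user_a", "subreddit": "sub_x", "date": "2017-11-01", "score": 40},
    {"author": "user_b", "subreddit": "sub_y", "date": "2017-11-02", "score": 3},
    {"author": "user_a", "subreddit": "sub_x", "date": "2018-09-01", "score": 7},
]


def monthly_counts(records):
    """Temporal view: number of posts per calendar month (YYYY-MM)."""
    return Counter(
        date.fromisoformat(r["date"]).strftime("%Y-%m") for r in records
    )


def top_authors(records, n=1):
    """User-level view: the n accounts with the most posts."""
    return Counter(r["author"] for r in records).most_common(n)


def most_popular(records, n=1):
    """Popularity view: the n highest-scoring posts."""
    return sorted(records, key=lambda r: r["score"], reverse=True)[:n]


by_month = monthly_counts(posts)
leaders = top_authors(posts, n=1)
popular = most_popular(posts, n=1)
```

Months with few or no posts in `by_month` would be candidate data voids under this reading, and `leaders` together with each account's subreddit would show who fills them.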
If this is right
- Data voids in Latino-related election talk leave room for mis- and disinformation to take hold.
- Extremist actors fill the absence of neutral discussion, shaping the visible narrative.
- Greater involvement by the Latino community itself could reduce the relative influence of these voids.
- Similar voids may recur in future election cycles if neutral participation stays low.
Where Pith is reading between the lines
- The same Reddit patterns could be checked on other platforms to see whether extremist dominance is platform-specific.
- Election officials or community groups might test targeted outreach campaigns to increase neutral Latino content volume.
- If subreddit self-labels prove unstable over time, the identification of extremist actors would need repeated validation.
Load-bearing premise
The sampled Reddit posts and the labeling of subreddits as extremist accurately reflect the full range of online talk about Latinos without major gaps or misclassifications.
What would settle it
A re-analysis of the same Reddit dataset that finds neutral or mainstream users accounting for more than half of the political posts about Latinos during the study period would disprove the dominance of extremist voices.
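One way to express this test, assuming each political post has already been assigned an actor label, is sketched below. The label names (`"troll"`, `"neutral"`) and the function names are hypothetical; the paper does not state how posts were labeled or counted.

```python
from collections import Counter


def label_shares(post_labels):
    """Return each label's fraction of all labeled posts."""
    counts = Counter(post_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def neutral_majority(post_labels, neutral_label="neutral"):
    """True if neutral posts make up more than half of the sample."""
    return label_shares(post_labels).get(neutral_label, 0.0) > 0.5


# Hypothetical sample: three posts from troll-identified subreddits, one neutral.
sample = ["troll", "troll", "troll", "neutral"]
shares = label_shares(sample)
claim_refuted = neutral_majority(sample)
```

Under these assumptions, `claim_refuted` being true on the full dataset would contradict the dominance claim.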
Original abstract
Given that the Latino community is the second largest ethnic group in the US, an understanding of how Latinos are discussed and targeted on social media during US elections is crucial. This paper explores these questions through a data analysis on Reddit, one of the most prominent and popular social media platforms for political discussion. We collected Reddit posts mentioning Latinos and the US midterm elections from September 24, 2017 to September 24, 2018. We analyzed people's posting patterns over time, and the digital traces of the individuals posting the majority of content and the most popular content. Our research highlights data voids that existed in online discussions surrounding Latinos prior to the US midterm elections. We observe a lack of neutral actors engaging Latinos in political topics. It appears that it is the more extremist voices (i.e. individuals operating within subreddits who identify themselves as political trolls) who are creating the most political content about Latinos. We conclude our report with a discussion of the possible dangers of data voids (especially with regard to their ties to mis- and disinformation) and recommendations to increase the involvement of the Latino community in future US elections.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes Reddit posts mentioning Latinos in relation to the 2018 US midterm elections collected between September 2017 and September 2018. It examines posting patterns, identifies data voids, notes the absence of neutral actors, and concludes that extremist voices from subreddits self-identifying as political trolls generate the majority of political content about Latinos. The paper discusses dangers of data voids linked to mis- and disinformation and recommends greater Latino community engagement in elections.
Significance. If the classification of extremist voices proves valid and the data collection is comprehensive and unbiased, the work could contribute observational evidence on how data voids in ethnic political discussions on Reddit may be filled by polarized actors, informing studies of computational propaganda. The focus on a specific ethnic group and election cycle adds to the literature on online discourse, though the absence of methodological transparency currently constrains its potential impact.
major comments (3)
- [Abstract] The central claim that 'it is the more extremist voices (i.e. individuals operating within subreddits who identify themselves as political trolls) who are creating the most political content about Latinos' is presented without any quantitative support, such as total post counts, the percentage of content volume attributable to the labeled group, or a statistical comparison to other actors, making the dominance finding impossible to evaluate.
- [Data collection and analysis, throughout] No details are supplied on the total number of posts collected, the precise keywords or Reddit API queries used to gather mentions of Latinos and elections, how 'political content' was operationalized versus general mentions, or the sample sizes underlying the time-series and popularity analyses, all of which are required to substantiate the observations of data voids and posting patterns.
- [Abstract and results] Classification of extremist voices: The labeling of subreddits and users as 'extremist' or 'political trolls' via subreddit membership and self-identification lacks any stated validation criteria, inter-rater reliability measures, or explicit decision rules. This classification is load-bearing for the claim that these voices dominate content production, yet no evidence is given that the labeling process avoids selection or confirmation bias.
minor comments (1)
- [Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., fraction of content from labeled extremist accounts) to allow readers to assess the strength of the main observation.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which highlights important areas for improving transparency and evaluability. We address each major comment point by point below and will make revisions to strengthen the manuscript where the comments identify gaps in presentation or detail.
- Referee: [Abstract] The central claim that 'it is the more extremist voices (i.e. individuals operating within subreddits who identify themselves as political trolls) who are creating the most political content about Latinos' is presented without any quantitative support, such as total post counts, the percentage of content volume attributable to the labeled group, or a statistical comparison to other actors, making the dominance finding impossible to evaluate.
  Authors: We agree that the abstract would benefit from explicit quantitative support to make the dominance claim immediately evaluable. The manuscript body contains time-series and popularity analyses supporting the observation, but these are not summarized numerically in the abstract. We will revise the abstract to include key metrics such as total posts collected, the share of political content from the identified subreddits, and a brief comparison to other actor types.
  Revision: yes
- Referee: [Data collection and analysis, throughout] No details are supplied on the total number of posts collected, the precise keywords or Reddit API queries used to gather mentions of Latinos and elections, how 'political content' was operationalized versus general mentions, or the sample sizes underlying the time-series and popularity analyses, all of which are required to substantiate the observations of data voids and posting patterns.
  Authors: We concur that full methodological transparency is necessary for reproducibility and evaluation of the data-void and posting-pattern claims. The current manuscript describes the collection window but omits granular details. We will add a dedicated Methods section specifying the Reddit API queries, exact keywords (including variants for 'Latino', 'Hispanic', and election-related terms), total posts collected, operationalization of 'political content' (via keyword filtering followed by manual review of a subsample), and sample sizes for all reported analyses.
  Revision: yes
- Referee: [Abstract and results] Classification of extremist voices: The labeling of subreddits and users as 'extremist' or 'political trolls' via subreddit membership and self-identification lacks any stated validation criteria, inter-rater reliability measures, or explicit decision rules. This classification is load-bearing for the claim that these voices dominate content production, yet no evidence is given that the labeling process avoids selection or confirmation bias.
  Authors: The labeling is grounded in explicit subreddit self-identification as political trolls rather than subjective content coding, which is a common and defensible method in computational social science to reduce coder bias. No inter-rater reliability was computed because the assignment derives directly from community self-descriptions and known affiliations, not interpretive judgment. To address concerns about transparency and potential bias, we will add an appendix listing the specific subreddits, with verbatim excerpts from their descriptions demonstrating self-identification, and a short discussion of how this membership-based approach limits confirmation bias.
  Revision: yes
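For context on the reliability measure the referee mentions: if two coders independently labeled the same subsample of subreddits or posts, agreement beyond chance could be summarized with Cohen's kappa. The sketch below is illustrative only. The authors state that no such measure was computed, and the coder labels shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(freq_a) | set(freq_b)
    )
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels from two coders on four subreddits.
coder_1 = ["troll", "troll", "neutral", "neutral"]
coder_2 = ["troll", "troll", "neutral", "troll"]
kappa = cohens_kappa(coder_1, coder_2)
```

Here the coders agree on three of four items and chance agreement is 0.5, so kappa comes out at 0.5.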
Circularity Check
No circularity: purely observational data analysis with no derivations or self-referential reductions
The paper reports an empirical study collecting and analyzing Reddit posts mentioning Latinos during the 2018 US midterm period. No equations, fitted parameters, predictions, or first-principles derivations appear in the provided abstract or description. Claims about posting patterns and 'extremist voices' rest on direct observation of collected data rather than any chain that reduces outputs to inputs by construction. Self-citation is not invoked as load-bearing justification for any uniqueness or ansatz. This matches the reader's assessment of circularity score 1.0 and qualifies as a standard non-finding for an observational social-media study.