Anti-Latinx Computational Propaganda in the United States

Claudia Flores-Saviaga; Saiph Savage

arxiv: 1906.10736 · v2 · pith:XBNWZWRQnew · submitted 2019-06-25 · 💻 cs.SI

Anti-Latinx Computational Propaganda in the United States

Claudia Flores-Saviaga , Saiph Savage This is my paper

Pith reviewed 2026-05-25 15:36 UTC · model grok-4.3

classification 💻 cs.SI

keywords RedditLatinosUS electionscomputational propagandadata voidsextremist voicespolitical trollsmisinformation

0 comments

The pith

Extremist voices on Reddit produce most political content about Latinos before US elections.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines Reddit posts that mention Latinos alongside the 2018 US midterm elections to map who drives online discussion of this community. It finds that content comes overwhelmingly from users active in subreddits that identify as political trolls, with few neutral participants filling the space. This pattern leaves data voids where misinformation can spread unchecked. Because Latinos form the second-largest ethnic group in the US, the dominance of one-sided voices can shape voter perceptions and turnout in ways that affect electoral outcomes. The authors close by linking these voids to risks of disinformation and calling for greater Latino community engagement to fill them.

Core claim

Collection and analysis of Reddit posts mentioning Latinos and the US midterm elections from September 2017 to September 2018 reveals a lack of neutral actors; instead, individuals operating in subreddits that self-identify as political trolls generate the majority of the political content, highlighting data voids that precede elections and can be exploited by mis- and disinformation.

What carries the argument

Temporal and user-level analysis of Reddit posts and subreddit affiliations to trace which accounts create the largest volume and most popular content about Latinos.

If this is right

Data voids in Latino-related election talk leave room for mis- and disinformation to take hold.
Extremist actors fill the absence of neutral discussion, shaping the visible narrative.
Greater involvement by the Latino community itself could reduce the relative influence of these voids.
Similar voids may recur in future election cycles if neutral participation stays low.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same Reddit patterns could be checked on other platforms to see whether extremist dominance is platform-specific.
Election officials or community groups might test targeted outreach campaigns to increase neutral Latino content volume.
If subreddit self-labels prove unstable over time, the identification of extremist actors would need repeated validation.

Load-bearing premise

The sampled Reddit posts and the labeling of subreddits as extremist accurately reflect the full range of online talk about Latinos without major gaps or misclassifications.

What would settle it

A re-analysis of the same Reddit dataset that finds neutral or mainstream users accounting for more than half of the political posts about Latinos during the study period would disprove the dominance of extremist voices.

Figures

Figures reproduced from arXiv: 1906.10736 by Claudia Flores-Saviaga, Saiph Savage.

**Figure 2.** Figure 2: Overview of how much each individual person posted on Reddit and the attention they received from [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: Example of political mega-thread from r/The_Donald where they discuss the political ecosystem, including political topics related to Latinos. • User Type B (“Pro-Trump + trolls”): All authors in this group (41.5% of most active users) belonged to r/The_Donald, a community known for its political trolling behaviour (Flores-Saviaga, 2018). Their posts focused on mobilizing people to vote Republican (pro… view at source ↗

**Figure 4.** Figure 4: Meme from r/The_Donald mocking Latinos. All of these active users posted about the current immigration situation of the United States and occasionally posted news about crimes that illegal immigrants had done. There was also a tendency to use such news reports to show special favoritism towards Trump and his decisions relating to illegal immigrant: “Dad’s grief leads to a quest to count deaths caused by i… view at source ↗

**Figure 5.** Figure 5: Example of a post on r/The_Donald. • User Type C (“The Neutrals”) This group (41.5% of the most active users) had a more neutral view on the topic of Latinos and their rights in the US They primarily posted news reports from sites that are known to have a neutral tone. For this reason, we called the groups “The Neutrals”. 35% of the posts of these users were mega-threads where they discussed the politica… view at source ↗

read the original abstract

Given that the Latino community is the second largest ethnic group in the US, an understanding of how Latinos are discussed and targeted on social media during US elections is crucial. This paper explores these questions through a data analysis on Reddit, one of the most prominent and popular social media platforms for political discussion. We collected Reddit posts mentioning Latinos and the US midterm elections from September 24, 2017 to September 24, 2018. We analyzed people's posting patterns over time, and the digital traces of the individuals posting the majority of content and the most popular content. Our research highlights data voids that existed in online discussions surrounding Latinos prior to the US midterm elections. We observe a lack of neutral actors engaging Latinos in political topics. It appears that it is the more extremist voices (i.e. individuals operating within subreddits who identify themselves as political trolls) who are creating the most political content about Latinos. We conclude our report with a discussion of the possible dangers of data voids (especially with regard to their ties to mis- and disinformation) and recommendations to increase the involvement of the Latino community in future US elections.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Paper maps Reddit posts on Latinos in 2018 midterms and claims extremist voices dominate content, but the classification of those voices has no visible validation.

read the letter

The main takeaway is that this paper pulls Reddit posts mentioning Latinos and the 2018 elections over a one-year window, tracks posting patterns, and concludes that extremist voices in certain subreddits produce most of the political content while neutral actors are scarce. It flags data voids as a risk for mis- and disinformation. That is the core observation the authors want readers to take away. The work applies standard social media collection and volume analysis to this narrow slice of election discourse. It does a service by documenting the time period and the absence of balanced voices on the platform. The point about data voids having downstream risks is reasonable to surface for researchers who track online ethnic-group coverage. The soft spot sits in the labeling step. The claim that extremist voices drive the bulk of content depends on correctly tagging users and subreddits as trolls or extremists, yet the abstract supplies no criteria, no reliability check, and no test of whether the result changes under different labeling rules. If the category is built from subreddit names or loose self-identification, the dominance finding can be shaped by the definition itself rather than raw volume. Sample sizes, exact query terms, and how popular content was ranked are also absent, so the scale of the voids cannot be judged. This is incremental work aimed at computational social scientists who study election talk or disinformation around specific ethnic groups. A reader already deep in that literature might note the Reddit-specific observations, but the thin methods reporting limits how far the findings can travel. The paper deserves a serious referee because the topic has clear stakes and the data-void angle is worth testing with proper documentation, even though the current version would need heavy revision on classification and transparency.

Referee Report

3 major / 1 minor

Summary. The manuscript analyzes Reddit posts mentioning Latinos in relation to the 2018 US midterm elections collected between September 2017 and September 2018. It examines posting patterns, identifies data voids, notes the absence of neutral actors, and concludes that extremist voices from subreddits self-identifying as political trolls generate the majority of political content about Latinos. The paper discusses dangers of data voids linked to mis- and disinformation and recommends greater Latino community engagement in elections.

Significance. If the classification of extremist voices proves valid and the data collection is comprehensive and unbiased, the work could contribute observational evidence on how data voids in ethnic political discussions on Reddit may be filled by polarized actors, informing studies of computational propaganda. The focus on a specific ethnic group and election cycle adds to the literature on online discourse, though the absence of methodological transparency currently constrains its potential impact.

major comments (3)

[Abstract] Abstract: The central claim that 'it is the more extremist voices (i.e. individuals operating within subreddits who identify themselves as political trolls) who are creating the most political content about Latinos' is presented without any quantitative support such as total post counts, percentage of content volume attributable to the labeled group, or statistical comparison to other actors, making the dominance finding impossible to evaluate.
[Data collection and analysis] Data collection and analysis (throughout): No details are supplied on the total number of posts collected, the precise keywords or Reddit API queries used to gather mentions of Latinos and elections, how 'political content' was operationalized versus general mentions, or the sample sizes underlying the time-series and popularity analyses, which are required to substantiate observations of data voids and posting patterns.
[Abstract and results] Classification of extremist voices (Abstract and results): The labeling of subreddits and users as 'extremist' or 'political trolls' via subreddit membership and self-identification lacks any stated validation criteria, inter-rater reliability measures, or explicit decision rules; this classification is load-bearing for the claim that these voices dominate content production, yet no evidence is given that the labeling process avoids selection or confirmation bias.

minor comments (1)

[Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., fraction of content from labeled extremist accounts) to allow readers to assess the strength of the main observation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive feedback, which highlights important areas for improving transparency and evaluability. We address each major comment point by point below and will make revisions to strengthen the manuscript where the comments identify gaps in presentation or detail.

read point-by-point responses

Referee: [Abstract] Abstract: The central claim that 'it is the more extremist voices (i.e. individuals operating within subreddits who identify themselves as political trolls) who are creating the most political content about Latinos' is presented without any quantitative support such as total post counts, percentage of content volume attributable to the labeled group, or statistical comparison to other actors, making the dominance finding impossible to evaluate.

Authors: We agree that the abstract would benefit from explicit quantitative support to make the dominance claim immediately evaluable. The manuscript body contains time-series and popularity analyses supporting the observation, but these are not summarized numerically in the abstract. We will revise the abstract to include key metrics such as total posts collected, the share of political content from the identified subreddits, and a brief comparison to other actor types. revision: yes
Referee: [Data collection and analysis] Data collection and analysis (throughout): No details are supplied on the total number of posts collected, the precise keywords or Reddit API queries used to gather mentions of Latinos and elections, how 'political content' was operationalized versus general mentions, or the sample sizes underlying the time-series and popularity analyses, which are required to substantiate observations of data voids and posting patterns.

Authors: We concur that full methodological transparency is necessary for reproducibility and evaluation of the data-void and posting-pattern claims. The current manuscript describes the collection window but omits granular details. We will add a dedicated Methods section specifying the Reddit API queries, exact keywords (including variants for 'Latino', 'Hispanic', and election-related terms), total posts collected, operationalization of 'political content' (via keyword filtering followed by manual review of a subsample), and sample sizes for all reported analyses. revision: yes
Referee: [Abstract and results] Classification of extremist voices (Abstract and results): The labeling of subreddits and users as 'extremist' or 'political trolls' via subreddit membership and self-identification lacks any stated validation criteria, inter-rater reliability measures, or explicit decision rules; this classification is load-bearing for the claim that these voices dominate content production, yet no evidence is given that the labeling process avoids selection or confirmation bias.

Authors: The labeling is grounded in explicit subreddit self-identification as political trolls rather than subjective content coding, which is a common and defensible method in computational social science to reduce coder bias. No inter-rater reliability was computed because the assignment derives directly from community self-descriptions and known affiliations, not interpretive judgment. To address concerns about transparency and potential bias, we will add an appendix listing the specific subreddits, with verbatim excerpts from their descriptions demonstrating self-identification, and a short discussion of how this membership-based approach limits confirmation bias. revision: yes

Circularity Check

0 steps flagged

No circularity: purely observational data analysis with no derivations or self-referential reductions

full rationale

The paper reports an empirical study collecting and analyzing Reddit posts mentioning Latinos during the 2018 US midterm period. No equations, fitted parameters, predictions, or first-principles derivations appear in the provided abstract or description. Claims about posting patterns and 'extremist voices' rest on direct observation of collected data rather than any chain that reduces outputs to inputs by construction. Self-citation is not invoked as load-bearing justification for any uniqueness or ansatz. This matches the reader's assessment of circularity score 1.0 and qualifies as a standard non-finding for an observational social-media study.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is an empirical observational study with no mathematical derivations, free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5722 in / 1090 out tokens · 32185 ms · 2026-05-25T15:36:18.206262+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

[1]

el bronco

Bokhari A., M., Yiannopoulos. (2016). An establishment conservative’s guide to the alt-right | breit- bart. https://www.breitbart.com/tech/2016/03/29/an-estab/l.Varishment-conservatives-guide -to-the-a/l.Vart-right/. Bosquez, C. P., A. (2018). Naleo ef tracking. https://www./l.Varatinodecisions.com/b/l.Varog/2018/09/11/ near/l.Vary-ha/l.Varf-of-/l.Varatin...

work page 2016
[2]

Sunstein, C. (2018). Cass r. sunstein: Is social media good or bad for democracy? | facebook newsroom. https:// newsroom.fb.com/news/2018/01/sunstein-democracy/. Tucker, J., Guess, A., Barberá, P., Vaccari, C., Siegel, A., Sanovich, S., . . . Nyhan, B. (2018). Social media, political polarization, and political disinformation: A review of the scientific l...

work page 2018

[1] [1]

el bronco

Bokhari A., M., Yiannopoulos. (2016). An establishment conservative’s guide to the alt-right | breit- bart. https://www.breitbart.com/tech/2016/03/29/an-estab/l.Varishment-conservatives-guide -to-the-a/l.Vart-right/. Bosquez, C. P., A. (2018). Naleo ef tracking. https://www./l.Varatinodecisions.com/b/l.Varog/2018/09/11/ near/l.Vary-ha/l.Varf-of-/l.Varatin...

work page 2016

[2] [2]

Sunstein, C. (2018). Cass r. sunstein: Is social media good or bad for democracy? | facebook newsroom. https:// newsroom.fb.com/news/2018/01/sunstein-democracy/. Tucker, J., Guess, A., Barberá, P., Vaccari, C., Siegel, A., Sanovich, S., . . . Nyhan, B. (2018). Social media, political polarization, and political disinformation: A review of the scientific l...

work page 2018