Using Word Embeddings to Examine Gender Bias in Dutch Newspapers, 1950-1990

Melvin Wevers

arxiv: 1907.08922 · v1 · pith:ZIHJXSSNnew · submitted 2019-07-21 · 💻 cs.CL · cs.CY · cs.LG · stat.ML

Pith reviewed 2026-05-24 18:53 UTC · model grok-4.3

keywords: gender bias, word embeddings, Dutch newspapers, historical language change, ideological divergence, depillarization, media bias

The pith

Word embeddings trained separately on six Dutch newspapers from 1950 to 1990 reveal gender bias shifting toward men overall while diverging by newspaper ideology.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper trains word embedding models on separate corpora from six national Dutch newspapers with different ideological backgrounds to measure local changes in word associations involving gender. It finds that bias moved toward women in themes such as sexuality and leisure but shifted toward men in general trends, even with rising female employment and feminist activity. Despite depillarization reducing ideological divides in Dutch society, the embeddings show growing divergence in gender bias between religious and social-democratic papers on one side and liberal papers on the other. A reader would care because this supplies quantitative evidence on how media language tracked or resisted social shifts around gender roles. The work also positions word embeddings as a practical method for studying historical language change related to bias.

Core claim

By comparing local changes in word embedding models trained on newspapers with divergent ideological backgrounds, the study demonstrates clear differences in gender bias and changes within and between newspapers over time. In relation to themes such as sexuality and leisure, the bias moves toward women, whereas generally the bias shifts in the direction of men, despite growing female employment numbers and feminist movements. Even though Dutch society became less stratified ideologically through depillarization, an increasing divergence in gender bias is found between religious and social-democratic newspapers on the one hand and liberal newspapers on the other.

What carries the argument

Separately trained word embedding models on ideologically divergent newspaper corpora, compared via local changes in associations between gender terms and other words.
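
The abstract does not define the bias measure, so the following is only a minimal sketch of one common association-based score, not the paper's metric. It assumes a target word's bias is the mean cosine similarity to a set of male terms minus the mean cosine similarity to a set of female terms. The function name `gender_bias`, the toy vectors, and the word lists are all hypothetical.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

def gender_bias(vectors, target, male_terms, female_terms):
    """Mean cosine similarity to male terms minus mean cosine similarity
    to female terms. Positive values lean male, negative values lean female.
    This is an assumed metric for illustration only."""
    t = unit(vectors[target])
    male = np.mean([t @ unit(vectors[w]) for w in male_terms])
    female = np.mean([t @ unit(vectors[w]) for w in female_terms])
    return float(male - female)

# Toy 2-d vectors standing in for one newspaper model (made-up values).
toy = {
    "man": np.array([1.0, 0.0]),
    "vrouw": np.array([0.0, 1.0]),
    "arbeid": np.array([0.9, 0.1]),
    "huishouden": np.array([0.1, 0.9]),
}

bias_arbeid = gender_bias(toy, "arbeid", ["man"], ["vrouw"])
bias_huishouden = gender_bias(toy, "huishouden", ["man"], ["vrouw"])
print(bias_arbeid, bias_huishouden)
```

Under this assumed score, a shift "toward men" for a theme would show up as the score for that theme's words rising between two time slices of the same newspaper.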

Load-bearing premise

Local changes in word associations from separately trained embedding models on different newspaper corpora directly and comparably reflect real differences in gender bias in the underlying texts rather than corpus statistics or training artifacts.

What would settle it

A manual review of gender associations in random samples of articles from the same newspapers that shows no matching patterns of shift or divergence by ideology would indicate the embeddings are not tracking the claimed bias differences.

Original abstract

Contemporary debates on filter bubbles and polarization in public and social media raise the question to what extent news media of the past exhibited biases. This paper specifically examines bias related to gender in six Dutch national newspapers between 1950 and 1990. We measure bias related to gender by comparing local changes in word embedding models trained on newspapers with divergent ideological backgrounds. We demonstrate clear differences in gender bias and changes within and between newspapers over time. In relation to themes such as sexuality and leisure, we see the bias moving toward women, whereas, generally, the bias shifts in the direction of men, despite growing female employment number and feminist movements. Even though Dutch society became less stratified ideologically (depillarization), we found an increasing divergence in gender bias between religious and social-democratic on the one hand and liberal newspapers on the other. Methodologically, this paper illustrates how word embeddings can be used to examine historical language change. Future work will investigate how fine-tuning deep contextualized embedding models, such as ELMO, might be used for similar tasks with greater contextual information.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper applies embeddings to Dutch newspaper gender bias but the cross-model comparisons lack any alignment step, so the reported shifts and divergences are hard to trust.

The main takeaway is that this work trains separate word embeddings on six Dutch newspapers from 1950-1990 and claims to find directional changes in gender associations plus growing divergence between liberal and religious/social-democratic papers. That specific historical application is new and the framing around depillarization and feminist-era employment trends is reasonable on its face. The abstract also positions the piece as a methodological illustration for historical language change, which is fair enough for a short piece.

What it does well is pick a concrete national corpus and tie the bias measures to real-world themes like sexuality, leisure, and occupation. The directional claims (bias moving toward women on some topics, toward men overall) are stated plainly.

The soft spot is exactly the one the stress-test flags: independently trained embedding spaces are only comparable up to orthogonal transformation, yet the abstract gives no sign of Procrustes, CCA, or anchor-word alignment. Without that, cosine shifts or nearest-neighbor changes between models can arise from initialization or vocabulary differences alone. The paper also supplies no equations for the bias metric, no statistical tests, and no corpus or training details, so the central claims rest on unverified assumptions about what the local changes actually capture.

This is for digital-humanities or computational-social-science readers who want to see embeddings tried on historical media bias; someone already working on Dutch newspapers or temporal embedding methods might extract a useful baseline idea. It deserves peer review so the authors can either add the alignment step and full methods or clarify why the comparisons still hold.

Referee Report

1 major / 1 minor

Summary. The paper trains separate word embedding models on six Dutch national newspapers (1950-1990) spanning different ideological backgrounds and compares gender bias via local changes in word associations. It claims directional shifts (bias moving toward women for sexuality/leisure themes but toward men generally) and increasing divergence between religious/social-democratic versus liberal papers despite depillarization, while illustrating embeddings for historical language analysis.

Significance. If the cross-model comparisons hold, the work offers a methodological illustration of embeddings for detecting historical bias trends in ideologically distinct corpora. No machine-checked proofs, reproducible code, or parameter-free derivations are described, so the significance rests entirely on the empirical claims about bias trajectories.

major comments (1)

[Abstract and Methods (implied)] The central claim requires comparing cosine similarities or nearest-neighbor relations between gender terms and target words across six independently trained embedding models. No alignment procedure (Procrustes, CCA, or shared anchors) is reported. Because skip-gram/CBOW spaces are defined only up to orthogonal transformation, reported 'local changes' and 'increasing divergence' between newspapers can be artifacts of random initialization or vocabulary differences rather than textual bias shifts. This is load-bearing for all temporal and cross-paper findings.
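
The point about orthogonal transformations can be illustrated with a toy example. The sketch below uses random NumPy vectors rather than the paper's models, and every name in it is hypothetical. It rotates one embedding matrix by a random orthogonal matrix: similarities computed within each space stay identical, while comparing a word's vector directly with its counterpart in the other space gives values unrelated to meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

# One hypothetical model: 5 "words" in 4 dimensions.
X = rng.normal(size=(5, 4))

# A random orthogonal matrix obtained from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))

# A second "model" with exactly the same geometry, only rotated.
Y = X @ Q

def cosine_matrix(M):
    """Pairwise cosine similarities between the rows of M."""
    N = M / np.linalg.norm(M, axis=1, keepdims=True)
    return N @ N.T

# Within-space similarities are unaffected by the rotation.
within_unchanged = np.allclose(cosine_matrix(X), cosine_matrix(Y))

# Cross-space cosine of each word with its own rotated copy.
cross = np.sum(X * Y, axis=1) / (
    np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1)
)
cross_is_identity = np.allclose(cross, 1.0)

print(within_unchanged, cross_is_identity)
```

Whether the paper's 'local changes' are built from within-space or cross-space quantities is not stated in the abstract, which is part of why the missing methods detail matters.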

minor comments (1)

[Abstract] The abstract provides no description of the bias metric, statistical tests, error bars, corpus construction details, or model hyperparameters, making it impossible to assess whether reported directional shifts are supported.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for identifying this methodological concern, which is central to the validity of our cross-model comparisons. We address it directly below.

Referee: [Abstract and Methods (implied)] The central claim requires comparing cosine similarities or nearest-neighbor relations between gender terms and target words across six independently trained embedding models. No alignment procedure (Procrustes, CCA, or shared anchors) is reported. Because skip-gram/CBOW spaces are defined only up to orthogonal transformation, reported 'local changes' and 'increasing divergence' between newspapers can be artifacts of random initialization or vocabulary differences rather than textual bias shifts. This is load-bearing for all temporal and cross-paper findings.

Authors: We agree that the absence of an alignment procedure is a substantive limitation for any claims involving comparisons across the six independently trained models. The manuscript as submitted does not describe or apply such a step (e.g., Procrustes, CCA, or anchor-based alignment), so the reported shifts and divergence could indeed be influenced by rotational differences or vocabulary mismatches. In the revised manuscript we will (1) implement Procrustes alignment of all six embedding spaces to a common reference space using a fixed set of high-frequency shared anchor words, (2) document the alignment procedure and the choice of anchors, and (3) re-compute the gender-bias measures and divergence statistics on the aligned spaces. We will also add a sensitivity check comparing results before and after alignment. This revision will be reflected in both the Methods and Results sections.

revision: yes
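
The anchor-based Procrustes step proposed in the rebuttal could look roughly like the sketch below. It is an illustration on synthetic data, not the authors' implementation: the function name `procrustes_rotation`, the matrix sizes, the noise level, and the choice of the first 20 rows as "anchors" are all assumptions. It solves the orthogonal Procrustes problem with an SVD and applies the resulting rotation to the full source space.

```python
import numpy as np

def procrustes_rotation(source_anchors, reference_anchors):
    """Return the orthogonal matrix R that minimizes
    ||source_anchors @ R - reference_anchors|| in the Frobenius norm."""
    u, _, vt = np.linalg.svd(source_anchors.T @ reference_anchors)
    return u @ vt

rng = np.random.default_rng(42)

# Hypothetical reference space: 50 shared words, 8 dimensions.
reference = rng.normal(size=(50, 8))

# Hypothetical second model: the same geometry, rotated, plus a little noise.
true_q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
source = reference @ true_q.T + 0.01 * rng.normal(size=(50, 8))

# Assume the first 20 rows are the high-frequency shared anchor words.
anchor_idx = np.arange(20)
R = procrustes_rotation(source[anchor_idx], reference[anchor_idx])

# Apply the rotation learned on the anchors to every word in the source space.
aligned = source @ R

error_before = np.linalg.norm(source - reference)
error_after = np.linalg.norm(aligned - reference)
print(error_before, error_after)
```

Fitting the rotation on anchors only and then applying it to all words is what lets non-anchor words move for genuine reasons. The sensitivity check mentioned in the rebuttal would amount to computing the bias measures on both `source` and `aligned` and comparing them.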

Circularity Check

0 steps flagged

No significant circularity in the paper's methodological derivation.

The paper trains independent embedding models on separate historical newspaper corpora and extracts association-based gender bias measures directly from those models. These operations are applied to external text data with no reported fitting of parameters that are subsequently renamed as predictions, no self-definitional reductions, and no load-bearing self-citations that justify uniqueness or ansatzes. The derivation chain is self-contained as a descriptive measurement technique and does not reduce any claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review; central claim rests on the domain assumption that embedding geometry faithfully encodes societal gender bias and that models trained on different ideological corpora are directly comparable. No free parameters or invented entities are visible in the abstract.

axioms (1)

domain assumption: Word embeddings trained on newspaper text capture semantic associations that reflect gender bias in the source material.
Implicit foundation for using local changes in embeddings to measure bias.
