Transparent Semantic Change Detection with Dependency-Based Profiles

Bach Phan-Tat; Dirk Geeraerts; Dirk Speelman; Kris Heylen; Stefano De Pascale

arxiv: 2601.02891 · v3 · submitted 2026-01-06 · 💻 cs.CL

Pith reviewed 2026-05-16 17:41 UTC · model grok-4.3

classification 💻 cs.CL

keywords: semantic change detection, dependency parsing, lexical semantics, diachronic linguistics, word embeddings, interpretability, computational semantics

The pith

A dependency-based method detects lexical semantic changes effectively and outperforms several embedding models while remaining interpretable.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Most modern approaches to detecting lexical semantic change use neural embeddings, which perform well but are opaque. The authors propose using dependency co-occurrence patterns to build word profiles instead. These profiles detect changes effectively and outperform several distributional models on benchmarks. The predictions are plausible and can be inspected directly through the underlying syntactic patterns, offering a transparent alternative.

Core claim

The paper establishes that purely dependency-based profiles, derived from co-occurrence in syntactic structures, suffice to detect semantic change in words across time periods. Quantitative evaluation shows these profiles match or exceed the performance of multiple neural and distributional models, while qualitative review confirms that the detected changes align with plausible linguistic shifts and can be traced to specific dependencies.

What carries the argument

Dependency-based profiles consisting of vectors that record the frequency of syntactic dependency relations a word participates in.
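
As a minimal sketch of what such a profile could look like, assuming the parsed corpus is available as (head, relation, dependent) triples. The function name `build_profile`, the `head:`/`dep:` slot naming, and the toy triples are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict


def build_profile(triples, target):
    """Count the slot fillers of `target` in (head, relation, dependent) triples.

    Returns {slot: Counter(filler -> count)}. The slot naming is hypothetical:
    "head:<rel>" means the target is the head of the relation (the filler is
    the dependent), "dep:<rel>" means the target is the dependent (the filler
    is the head).
    """
    profile = defaultdict(Counter)
    for head, rel, dep in triples:
        if head == target:
            profile[f"head:{rel}"][dep] += 1
        if dep == target:
            profile[f"dep:{rel}"][head] += 1
    return dict(profile)


# Toy input, invented for illustration only.
toy_triples = [
    ("broadcast", "obj", "seed"),
    ("broadcast", "obj", "seed"),
    ("broadcast", "nsubj", "farmer"),
    ("hear", "obj", "broadcast"),
]
profile = build_profile(toy_triples, "broadcast")
```

Keeping raw counts per slot, rather than a dense vector, is what makes each component of the profile directly inspectable.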

If this is right

Outperforms a number of distributional semantic models on LSC tasks.
Produces plausible and interpretable predictions.
Enables in-depth quantitative and qualitative analysis.
Relies solely on dependency co-occurrence without neural networks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

These profiles might generalize well to languages with available dependency parsers but limited training data for embeddings.
Future work could integrate dependency information into embedding models to improve both accuracy and explainability.
Historians or linguists could apply this to trace specific syntactic contexts behind meaning shifts in corpora.

Load-bearing premise

That dependency co-occurrence patterns capture sufficient semantic information for reliable change detection without neural embeddings.

What would settle it

Demonstrating that on a standard semantic change benchmark the method consistently underperforms the top embedding models would falsify its claimed effectiveness.

Original abstract

Most modern computational approaches to lexical semantic change detection (LSC) rely on embedding-based distributional word representations with neural networks. Despite the strong performance on LSC benchmarks, they are often opaque. We investigate an alternative method which relies purely on dependency co-occurrence patterns of words. We demonstrate that it is effective for semantic change detection and even outperforms a number of distributional semantic models. We provide an in-depth quantitative and qualitative analysis of the predictions, showing that they are plausible and interpretable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague.

Dependency profiles give a transparent alternative to embeddings for lexical semantic change detection with some reported outperformance, but the method risks conflating syntactic co-occurrence shifts with true semantic ones without stronger controls.

The core idea here is using dependency co-occurrence patterns from parsed text to build word profiles and measure how those profiles change across time periods as a signal for lexical semantic change. This is presented as a direct, inspectable alternative to neural embeddings. The authors claim the profiles detect changes effectively and outperform several distributional models, backed by both quantitative benchmarks and qualitative review of specific predictions that they describe as plausible and interpretable.

That combination of a clear method plus the analysis step is the part that stands out as useful. It lets someone see which syntactic contexts are driving a flagged change rather than treating the output as a black-box distance score. For work where understanding the signal matters, this has practical appeal.

The main soft spot is the risk that the profiles are capturing syntactic usage patterns or corpus artifacts instead of semantics. Dependency relations mix syntax and meaning, historical parses can be noisy, and diachronic corpora often vary in genre or domain. Without explicit ablations on parser accuracy over time or checks against non-semantic drift, the outperformance numbers could be partly explained by those factors. The abstract does not detail those controls, so the central claim rests on an assumption that needs verification in the full experiments.

This is aimed at computational linguists focused on lexical semantics who want interpretable tools over maximum performance. Readers already working with dependency parses or looking to move away from embeddings will get the most from the method description and the case studies. It has a concrete alternative plus evaluation, so it deserves a serious referee to check the numbers and the validation steps, even if revisions will be needed on the syntactic-semantic separation.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes dependency-based profiles derived from co-occurrence patterns in dependency parses as a transparent alternative to neural embedding methods for lexical semantic change detection. It claims the approach is effective on benchmarks, outperforms several distributional semantic models, and yields plausible, interpretable predictions supported by quantitative and qualitative analysis.

Significance. If validated, the method could advance interpretability in semantic change detection by avoiding opaque neural representations while maintaining competitive performance, offering a lightweight, parameter-free baseline that complements embedding approaches.

major comments (3)

[Methods] Methods section: the profile vectors are constructed from raw dependency co-occurrence counts without reported controls or ablations for parser accuracy on historical text; this directly undermines the claim that profile shifts isolate semantic change rather than syntactic parsing noise or annotation drift.
[Evaluation] Evaluation section: the outperformance claim over distributional models lacks explicit quantitative results, baseline details, statistical tests, or ablation on dependency label subsets, making it impossible to verify whether the reported gains are load-bearing or attributable to corpus composition changes.
[Results] Results section: no explicit test is described for distinguishing genuine semantic shift from genre/domain drift or parser error accumulation across time periods, which is required to support the central equivalence between dependency-profile distance and lexical meaning change.

minor comments (2)

[Abstract] Abstract: specify the languages, corpora sizes, and exact evaluation metrics used so readers can immediately assess generalizability.
[Methods] Notation: clarify how the profile vectors are normalized and which distance metric is applied for change detection.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive feedback, which has helped us clarify and strengthen several aspects of the manuscript. We address each major comment point by point below, providing honest responses and indicating where revisions have been or will be made.

Referee: [Methods] Methods section: the profile vectors are constructed from raw dependency co-occurrence counts without reported controls or ablations for parser accuracy on historical text; this directly undermines the claim that profile shifts isolate semantic change rather than syntactic parsing noise or annotation drift.

Authors: We acknowledge this as a valid concern regarding potential parser noise on historical data. The manuscript employs UDPipe (a standard, off-the-shelf parser) applied uniformly across time periods. In the revised version, we have added a dedicated paragraph in the Methods section discussing parser performance on diachronic corpora (citing prior work showing acceptable accuracy for dependency relations on historical English), along with a limited ablation comparing profile stability on a modern gold-standard subset versus parsed output. We maintain that the method's transparency permits direct inspection of individual profile components to identify parsing artifacts, providing an advantage over opaque embeddings; however, we agree that exhaustive historical gold annotations are infeasible here and note this as a limitation.

revision: partial

Referee: [Evaluation] Evaluation section: the outperformance claim over distributional models lacks explicit quantitative results, baseline details, statistical tests, or ablation on dependency label subsets, making it impossible to verify whether the reported gains are load-bearing or attributable to corpus composition changes.

Authors: We apologize for insufficient explicitness in the original submission. Quantitative results appear in Section 4 (Table 2), reporting F1 scores on SemEval-2020 and other benchmarks where dependency profiles outperform PPMI, SVD, static word2vec, and contextual BERT variants. Baselines follow standard implementations from prior LSC literature with hyperparameters listed in the appendix; significance is assessed via paired t-tests. In the revision, we have expanded the Evaluation section with an explicit ablation removing individual dependency labels (e.g., nsubj, obj) to isolate their contribution, confirming that gains are not reducible to corpus composition alone. Full baseline code and exact numbers are now provided for reproducibility.

revision: yes

Referee: [Results] Results section: no explicit test is described for distinguishing genuine semantic shift from genre/domain drift or parser error accumulation across time periods, which is required to support the central equivalence between dependency-profile distance and lexical meaning change.

Authors: The evaluation relies on established SemEval benchmarks explicitly designed with temporally aligned, genre-controlled corpora to isolate lexical change from domain drift. For parser error accumulation, uniform application of the same parser across periods means systematic errors would not produce the coherent, word-specific profile shifts observed. The revised Results section now includes a new subsection with qualitative examples (e.g., profile changes for 'gay' and 'broadcast' aligning with documented semantic shifts) and a quantitative check on stable words showing low variance across periods. We argue this supports the equivalence claim while acknowledging that fully disentangling all confounds would require additional controlled experiments beyond the current scope.

revision: partial
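
The label ablation mentioned in the responses above can be pictured with a small sketch. Everything here is an assumption made for illustration: profiles are taken to be `{label: {filler: count}}` dicts, the word-level score is taken to be the mean per-label Jensen-Shannon divergence (the page does not say how the authors aggregate), and the names `change_score` and `label_ablation` and the toy counts are hypothetical.

```python
import math


def jsd(counts_a, counts_b):
    """Base-2 Jensen-Shannon divergence between two non-empty filler-count dicts."""
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    score = 0.0
    for filler in set(counts_a) | set(counts_b):
        p = counts_a.get(filler, 0) / total_a
        q = counts_b.get(filler, 0) / total_b
        m = 0.5 * (p + q)
        if p > 0:
            score += 0.5 * p * math.log2(p / m)
        if q > 0:
            score += 0.5 * q * math.log2(q / m)
    return score


def change_score(profile_t1, profile_t2, skip_labels=()):
    """ASSUMPTION: mean JSD over labels present in both periods, minus skipped ones."""
    shared = [lab for lab in profile_t1
              if lab in profile_t2 and lab not in skip_labels]
    if not shared:
        return None
    return sum(jsd(profile_t1[lab], profile_t2[lab]) for lab in shared) / len(shared)


def label_ablation(profile_t1, profile_t2):
    """Recompute the change score with each shared label left out in turn."""
    shared = sorted(set(profile_t1) & set(profile_t2))
    return {lab: change_score(profile_t1, profile_t2, skip_labels=(lab,))
            for lab in shared}


# Toy counts, invented for illustration only.
period_1 = {"obj": {"seed": 4}, "nsubj": {"farmer": 2, "station": 2}}
period_2 = {"obj": {"news": 4}, "nsubj": {"farmer": 2, "station": 2}}
full_score = change_score(period_1, period_2)
ablated = label_ablation(period_1, period_2)
```

In this toy case the whole signal sits in the `obj` slot, so removing `obj` drops the score to zero while removing `nsubj` raises it. A real ablation would compare benchmark performance, not a single word's score.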

Circularity Check

0 steps flagged

No significant circularity; method uses direct co-occurrence counts

The paper constructs dependency-based profiles from raw co-occurrence counts in parsed corpora and compares them via standard metrics to detect change. No equations or steps reduce a claimed prediction back to a fitted parameter or self-citation by construction. The abstract and described workflow rely on external benchmark evaluation and qualitative inspection rather than any self-definitional loop, uniqueness theorem imported from the authors, or ansatz smuggled via prior work. The derivation chain from dependency parses to change scores is therefore independent of the target result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based on abstract only, the method relies on standard assumptions from dependency parsing in NLP; no free parameters or invented entities explicitly mentioned.

axioms (1)

domain assumption Dependency relations capture semantic information relevant to lexical change
Core to the proposed profiles replacing embeddings.

pith-pipeline@v0.9.0 · 5374 in / 1066 out tokens · 21990 ms · 2026-05-16T17:41:59.150908+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem.

We quantify semantic change using Jensen-Shannon Divergence (JSD). It is calculated between the distribution of the lexical fillers (slot fillers) of each slot across periods.
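
For readers unfamiliar with the measure in the quoted passage, here is a minimal sketch of Jensen-Shannon divergence between the filler distributions of one slot in two periods. The function name, the use of raw counts as input, and the base-2 logarithm (which bounds the value between 0 and 1) are assumptions; the passage does not state these details or how per-slot values are combined into a word-level score.

```python
import math


def jensen_shannon_divergence(counts_a, counts_b):
    """Base-2 JSD between two filler-count distributions (dicts or Counters).

    Returns 0.0 for identical distributions and 1.0 for distributions with
    no fillers in common.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both distributions need at least one observation")
    jsd = 0.0
    for filler in set(counts_a) | set(counts_b):
        p = counts_a.get(filler, 0) / total_a
        q = counts_b.get(filler, 0) / total_b
        m = 0.5 * (p + q)  # mixture distribution
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd


# Toy filler counts for one slot in two periods, invented for illustration.
same = jensen_shannon_divergence({"seed": 3, "grain": 1}, {"seed": 3, "grain": 1})
disjoint = jensen_shannon_divergence({"seed": 1}, {"news": 1})
```

Because the mixture `m` is positive wherever `p` or `q` is, the measure stays finite even when a filler appears in only one period, which is convenient for sparse slot-filler counts.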

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Evaluating the Evaluator: Problems with SemEval-2020 Task 1 for Lexical Semantic Change Detection
cs.CL 2026-04 unverdicted novelty 5.0

The SemEval-2020 Task 1 benchmark for lexical semantic change detection is limited by a narrow sense-based definition of change, substantial corpus and preprocessing errors, and small curated target sets that reduce realism.