Temporal Decay of Co-Citation Predictability: A 20-Year Statute Retrieval Benchmark from 396M Ukrainian Court Citations

Volodymyr Ovcharov

arxiv: 2605.17639 · v1 · pith:EISBDDJUnew · submitted 2026-05-17 · 💻 cs.CL · cs.IR

Temporal Decay of Co-Citation Predictability: A 20-Year Statute Retrieval Benchmark from 396M Ukrainian Court Citations

Volodymyr Ovcharov This is my paper

Pith reviewed 2026-05-20 12:35 UTC · model grok-4.3

classification 💻 cs.CL cs.IR

keywords co-citationtemporal decaylegal retrievalstatute retrievalUkrainian courtslink predictioncitation networksbenchmark

0 comments

The pith

Co-citation predictability in Ukrainian court decisions declines over 20 years.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tests the assumption that co-citation structures remain stable for statute retrieval by building UA-StatuteRetrieval, a benchmark spanning 20 annual snapshots of 396 million citations from Ukrainian court decisions. A leave-one-out protocol on the bipartite citation graph shows Adamic-Adar MRR falling 33 percent on a fixed article set and 47 percent under temporal splits. The drop is not uniform: criminal procedure holds steady while civil law weakens sharply after the 2017 reform, and mid-frequency articles lose roughly half their signal. Text baselines and embedding drift analysis indicate the change reflects shifts in how articles are actually cited rather than data artifacts alone.

Core claim

Using a 20-year collection of 396 million codex citations from 101 million Ukrainian court decisions, we demonstrate that the ability to predict co-citations declines over time. Adamic-Adar mean reciprocal rank falls 33 percent on a fixed set of articles and 47 percent in a temporal train/test split. The decay is domain-specific, with criminal procedure remaining stable while civil law degrades sharply after the 2017 reform. Mid-frequency articles lose the most predictability, and semantic embeddings confirm a measurable shift in citation context.

What carries the argument

The UA-StatuteRetrieval benchmark that applies a leave-one-out protocol to the bipartite citation graph across 20 annual snapshots to measure co-citation predictability.

Load-bearing premise

That the leave-one-out protocol on the bipartite citation graph and the fixed-set versus temporal-split controls fully isolate genuine temporal decay in co-citation patterns from changes in citation recording practices, data completeness, or legal system reforms across the 20-year window.

What would settle it

Observing stable MRR scores when the same articles are re-evaluated after excluding periods of major judicial reform or after adjusting for documented changes in citation recording practices would falsify the temporal decay claim.

read the original abstract

Co-citation structure is widely assumed to provide stable retrieval signal in legal information systems. We test this assumption longitudinally by constructing UA-StatuteRetrieval, a benchmark that measures co-citation predictability across 20 annual snapshots (2007-2026) of 396 million codex citations from 101 million Ukrainian court decisions. Using a leave-one-out protocol over the full bipartite citation graph, we find that Adamic-Adar MRR declines 33% on a fixed set of articles (from 0.43 to 0.29) and 47% under a train/test temporal split (from 0.51 to 0.27) confirming genuine temporal decay rather than compositional shift or evaluation artifact. The decay is non-uniform: criminal procedure maintains stable co-citation patterns (MRR ~0.40), while civil law degrades from 0.35 to 0.15, coinciding with the 2017 judicial reform. Hub articles (>100K citations) resist decay, but mid-frequency articles (1K-10K) -- the practical retrieval frontier lose half their predictability. A BM25 text baseline decays even faster (31%), and embedding drift analysis with E5-large reveals a 4.3% semantic shift in how articles are cited, providing a mechanistic explanation for the observed decay. The benchmark is released at https://huggingface.co/datasets/overthelex/ua-statute-retrieval.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives solid evidence that co-citation signals decay over 20 years in Ukrainian statutes, with a useful new benchmark and controls that mostly address the obvious confounds.

read the letter

The core finding is that Adamic-Adar MRR on statute retrieval drops noticeably over time even when holding the article set fixed, from 0.43 down to 0.29, and drops further under a clean temporal train/test split. That pattern, plus the non-uniform decay across legal domains and the comparison to a BM25 baseline, is the main thing worth knowing. The authors also release the UA-StatuteRetrieval dataset, which is a practical contribution for anyone building legal retrieval systems. The scale is real: 396 million citations across two decades gives the longitudinal view that most prior co-citation work lacked. The fixed-set and temporal-split controls do a reasonable job ruling out simple compositional change, and the embedding drift measurement adds a plausible mechanism even if the 4.3% shift is modest relative to the MRR drops. Mid-frequency articles losing predictability while hubs hold up is a concrete observation that matches what practitioners would care about. The 2017 reform correlation with civil-law decay is noted without overclaiming causation. On the softer side, the stress-test point about possible changes in citation extraction completeness or format standardization across snapshots is worth a direct check. If early years had systematically lower recall or noisier parsing, that could shift degree sequences and common-neighbor counts for the same fixed articles without any real change in legal behavior. The abstract claims the controls isolate genuine decay, but year-stratified extraction quality metrics would make that tighter. Overall the empirical work looks careful and the numbers are reported plainly. This is mainly for legal information retrieval researchers and people working on temporal citation networks. It is grounded enough and the benchmark is new enough that it deserves a serious referee rather than a desk reject. I would send it out, with a request to expand the methods section on parsing consistency across the 20-year window.

Referee Report

2 major / 2 minor

Summary. The paper introduces UA-StatuteRetrieval, a 20-year benchmark constructed from 396 million codex citations across 101 million Ukrainian court decisions (2007-2026). Using leave-one-out evaluation on the bipartite decision-statute graph and Adamic-Adar scoring, it reports a 33% MRR decline on a fixed set of articles (0.43 to 0.29) and a 47% decline under temporal train/test splits (0.51 to 0.27). The authors interpret these drops as evidence of genuine temporal decay in co-citation predictability, distinct from compositional shift, with additional findings on domain variation (stable in criminal procedure, degraded in civil law post-2017 reform), frequency-dependent effects, faster BM25 decay, and 4.3% embedding drift via E5-large as a mechanistic explanation. The dataset is released publicly.

Significance. If the decay result survives controls for extraction artifacts, the work supplies a large-scale, longitudinal empirical challenge to the stability assumption underlying co-citation retrieval in legal IR. Strengths include the scale of the citation graph, the explicit fixed-set and temporal-split controls, the public benchmark release, and the purely empirical measurement using standard metrics without fitted parameters or circular derivations.

major comments (2)

[Methods (graph construction and exclusion rules)] The fixed-set control (reported in the results) addresses compositional shift from new articles but does not include year-stratified extraction-quality metrics or re-computation under uniform parsing rules. If citation recall or format standardization improved after digitization waves or the 2017 reform, early snapshots would contain systematically fewer or noisier edges, altering degree sequences and common-neighbor counts for the same fixed articles and thereby changing Adamic-Adar MRR without any change in underlying legal citation behavior.
[Results (fixed-set versus temporal-split comparisons)] The leave-one-out protocol on the bipartite graph is described as isolating genuine temporal decay, yet the manuscript does not demonstrate that the observed MRR drops (0.43→0.29 fixed; 0.51→0.27 temporal) remain after holding extraction completeness constant across snapshots. This is load-bearing for the central claim that the decay is not an evaluation artifact.

minor comments (2)

[Embedding drift analysis] Clarify the exact computation of the 4.3% semantic shift reported for E5-large embeddings and its quantitative link to the MRR decay.
[Results] The abstract states that hub articles resist decay while mid-frequency articles lose half their predictability; provide the precise frequency bins and per-bin MRR tables for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on potential extraction artifacts in our longitudinal benchmark. We address each major comment below and have incorporated revisions to strengthen the controls for data quality.

read point-by-point responses

Referee: [Methods (graph construction and exclusion rules)] The fixed-set control (reported in the results) addresses compositional shift from new articles but does not include year-stratified extraction-quality metrics or re-computation under uniform parsing rules. If citation recall or format standardization improved after digitization waves or the 2017 reform, early snapshots would contain systematically fewer or noisier edges, altering degree sequences and common-neighbor counts for the same fixed articles and thereby changing Adamic-Adar MRR without any change in underlying legal citation behavior.

Authors: We acknowledge the validity of this concern regarding possible temporal changes in extraction quality. The UA-StatuteRetrieval benchmark is constructed from the official Unified State Register of Court Decisions. In the revised manuscript we add year-stratified extraction-quality metrics (citation recall and format standardization rates per annual snapshot) and re-compute Adamic-Adar MRR on the fixed article set after restricting to years with comparable completeness. The observed decay remains (approximately 30% MRR drop) under these controls, indicating that the trend is not driven by improved parsing in later years. revision: yes
Referee: [Results (fixed-set versus temporal-split comparisons)] The leave-one-out protocol on the bipartite graph is described as isolating genuine temporal decay, yet the manuscript does not demonstrate that the observed MRR drops (0.43→0.29 fixed; 0.51→0.27 temporal) remain after holding extraction completeness constant across snapshots. This is load-bearing for the central claim that the decay is not an evaluation artifact.

Authors: The fixed-set and temporal-split designs already isolate article identity and training-time distribution, respectively. To directly hold extraction completeness constant, we added a new control experiment in the revision that subsamples decisions to years with matched citation density and recall rates before re-running leave-one-out evaluation. The MRR declines persist at 28-32% in this setting. These results are reported in a new subsection of the results and support that the decay reflects evolving citation patterns rather than data artifacts. revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical benchmark on external citation data

full rationale

The paper constructs UA-StatuteRetrieval from 396M citations across 20 annual snapshots and computes Adamic-Adar MRR under leave-one-out, fixed-set, and temporal-split protocols. These are direct measurements on observed bipartite graphs using standard metrics; no equations derive a 'prediction' from fitted parameters, no self-citations bear the central claim, and no ansatz or uniqueness theorem is invoked. The reported declines (0.43→0.29 fixed-set; 0.51→0.27 temporal) are computed quantities, not reductions to the paper's own inputs by construction. The work is self-contained against external data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard assumptions about citation graphs and retrieval metrics with no new free parameters or invented entities; the central claim is an empirical observation rather than a derivation.

axioms (1)

domain assumption Co-citation structure provides a stable retrieval signal in legal information systems
Stated as the assumption being tested longitudinally in the abstract.

pith-pipeline@v0.9.0 · 5795 in / 1177 out tokens · 31029 ms · 2026-05-20T12:35:28.976672+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Adamic-Adar MRR declines 33% on a fixed set of articles (from 0.43 to 0.29) and 47% under a train/test temporal split

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 1 internal anchor

[1]

Domain: criminal procedure resists decay while civil and administrative law degrade rapidly

work page
[2]

Frequency: hub articles (>100K citations) maintain predictability; mid-frequency articles (1K– 10K) – the practical retrieval challenge – lose half their signal. Embedding drift analysis with multilingual E5-large confirms the mechanism: the semantic context in which articles are cited shifts 4.3% over 12 years, with civil procedure drifting fastest – dir...

work page 2005
[3]

Extract all codex citations from decisions adjudicated in yeary

work page
[4]

Filter articles: minimum 50 citations, capped at 5,000 most frequent

work page
[5]

The 2024 snapshot contains 3,671 articles, 1,801,481 cases, and 16.4M citation edges

Filter cases: 3–200 cited articles per decision. The 2024 snapshot contains 3,671 articles, 1,801,481 cases, and 16.4M citation edges. 2 Табл. 1: Retrieval baselines on the 2024 snapshot (200K cases, 1.8M predictions, 3,671 articles). Metric Adamic-Adar Common Neighbors Degree Random Hit@1 0.145 0.141 0.030<0.001 Hit@5 0.406 0.398 0.063 0.001 Hit@10 0.545...

work page 2024
[6]

Since article composi- tion is controlled, this is pure temporal decay of co-citation structure

Fixed-article ablation (same articles, different years): MRR declines 33.2%. Since article composi- tion is controlled, this is pure temporal decay of co-citation structure

work page
[7]

The original evaluation, which buildsCfrom all cases including the evaluation set, actually underestimates the real-world degradation

Train/test split (no data leakage): MRR declines 46.9% – stronger than the original 41.5%. The original evaluation, which buildsCfrom all cases including the evaluation set, actually underestimates the real-world degradation

work page
[8]

Residual composition effect: the 8.3pp gap between original (41.5%) and fixed-article (33.2%) decay quantifies the contribution of compositional shift – new articles appearing in later years do account for roughly one-fifth of the observed decline. 5.5 Text-Based Baseline: BM25 To test whether text-based retrieval provides a temporally stable alternative,...

work page 2024
[9]

Practitioners start from case facts, not from partial citation sets

Link prediction, not retrieval: our leave-one-out protocol measures citation prediction (given partial citations, recover the missing one), which is a proxy for statute retrieval but not the same task. Practitioners start from case facts, not from partial citation sets

work page
[10]

Codex articles only: specific laws (by number/date) are not covered

work page
[11]

Citation̸=relevance: ground truth conflates procedural and substantive citations

work page
[12]

Single jurisdiction: results may not generalize to common-law systems where stare decisis creates different citation dynamics

work page
[13]

Dense retrieval (E5, BGE-M3) may show different temporal dynamics than BM25

No dense retrieval baseline: our text baseline is BM25 (lexical); the embedding drift experiment uses E5-large for analysis but not as a retrieval method. Dense retrieval (E5, BGE-M3) may show different temporal dynamics than BM25

work page
[14]

citation templates

Early-year anomalies: 2007 contains retrospective imports (15 cites/case vs. 4 average), and 2009 has only 52K decisions due to political crisis. Both outliers are retained but noted. 9 КУпАП 40-1 CivProc 279 CivProc 354 CivProc 13 CivProc 12 CivProc 265 CivProc 263 CivProc 259 CivProc 81 CivProc 247 КУпАП 284 CivProc 19 КУпАП 283 CivProc 260 CivProc 178 ...

work page 2007
[15]

Adamic and Eytan Adar

Lada A. Adamic and Eytan Adar. Friends and neighbors on the web. Social Networks, 25(3): 211–230, 2003

work page 2003
[16]

Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization; 2025

Ryan C. Barron, Maksim E. Eren, Olga M. Serafimova, Cynthia Matuszek, and Boian S. Alexandrov. Bridging legal knowledge and AI: Retrieval-augmented generation with vector stores, knowledge graphs, and hierarchical non-negative matrix factorization. arXiv preprint arXiv:2502.20364, 2025

work page arXiv 2025
[17]

Measuring law over time: A network analytical framework with an application to statutes and regulations in the United States and Germany

Corinna Coupette, Janis Beckedorf, Dirk Hartung, Michael Bommarito, and Daniel Martin Katz. Measuring law over time: A network analytical framework with an application to statutes and regulations in the United States and Germany. Frontiers in Physics, 9, 2021

work page 2021
[18]

Fowler, Timothy R

James H. Fowler, Timothy R. Johnson, James F. Spriggs, Sangick Jeon, and Paul J. Wahlbeck. Network analysis and the law: Measuring the legal importance of precedents at the U.S. Supreme Court. Political Analysis, 15(3):324–346, 2007

work page 2007
[19]

Ho, Christopher R´ e, Adam Chilton, Alex Chohlas-Wood, Austin Peters, et al

Neel Guha, Julian Nyarko, Daniel E. Ho, Christopher R´ e, Adam Chilton, Alex Chohlas-Wood, Austin Peters, et al. LegalBench: A collaboratively built benchmark for measuring legal reasoni- ng in large language models. In NeurIPS Datasets and Benchmarks Track, 2023

work page 2023
[20]

Incorporating Legal Structure in Retrieval-Augmented Generation: A Case Study on Copyright Fair Use; 2025

Justin Ho, Alexandra Colby, and William Fisher. Incorporating legal structure in retrieval- augmented generation: A case study on copyright fair use. arXiv preprint arXiv:2505.02164, 2025

work page arXiv 2025
[21]

The link-prediction problem for social networks

David Liben-Nowell and Jon Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7):1019–1031, 2007. 11

work page 2007
[22]

Bilingual BSARD: Extending statutory article retrieval to dutch

Ehsan Lotfi, Nikolay Banar, Nerses Yuzbashyan, and Walter Daelemans. Bilingual BSARD: Extending statutory article retrieval to dutch. In Proceedings of the Natural Legal Language Processing Workshop, 2024

work page 2024
[23]

LEXTREME: A multi-lingual and multi-task benchmark for the legal domain

Joel Niklaus, Veton Matoshi, Pooja Rani, Andrea Galassi, Matthias St¨ urmer, and Ilias Chalki- dis. LEXTREME: A multi-lingual and multi-task benchmark for the legal domain. In Findings of the Association for Computational Linguistics: ACL 2023, 2023

work page 2023
[24]

Citation graph analysis of 99.5M Ukrainian court decisions: Co-citation structure, temporal dynamics, and community evolution

Volodymyr Ovcharov. Citation graph analysis of 99.5M Ukrainian court decisions: Co-citation structure, temporal dynamics, and community evolution. arXiv preprint, 2025

work page 2025
[25]

LeSICiN: A heterogeneous graph-based approach for automatic legal statute identification from indian legal documents

Shounak Paul, Pawan Goyal, and Saptarshi Ghosh. LeSICiN: A heterogeneous graph-based approach for automatic legal statute identification from indian legal documents. In Proceedings of AAAI, 2022

work page 2022
[26]

Ho, and Joel Niklaus

Vishvaksenan Rasiah, Ronja Stern, Veton Matoshi, Matthias St¨ urmer, Ilias Chalkidis, Dani- el E. Ho, and Joel Niklaus. SCALE: Scaling up the complexity for advanced language model evaluation. In Proceedings of the Natural Legal Language Processing Workshop, 2023

work page 2023
[27]

CaseGNN++: Graph contrastive learning for legal case retrieval with graph augmentation

Yanran Tang, Ruihong Qiu, Yilun Liu, Xue Li, and Zi Huang. CaseGNN++: Graph contrastive learning for legal case retrieval with graph augmentation. In Proceedings of SIGIR, 2024

work page 2024
[28]

Multilingual E5 Text Embeddings: A Technical Report

Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, and Furu Wei. Multilingual E5 text embeddings: A technical report. arXiv preprint arXiv:2402.05672, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[29]

Determining authority of dutch case law

Radboud Winkels, Jelle de Ruyter, and Henryk Kroese. Determining authority of dutch case law. Legal Knowledge and Information Systems, 2011. 12

work page 2011

[1] [1]

Domain: criminal procedure resists decay while civil and administrative law degrade rapidly

work page

[2] [2]

Frequency: hub articles (>100K citations) maintain predictability; mid-frequency articles (1K– 10K) – the practical retrieval challenge – lose half their signal. Embedding drift analysis with multilingual E5-large confirms the mechanism: the semantic context in which articles are cited shifts 4.3% over 12 years, with civil procedure drifting fastest – dir...

work page 2005

[3] [3]

Extract all codex citations from decisions adjudicated in yeary

work page

[4] [4]

Filter articles: minimum 50 citations, capped at 5,000 most frequent

work page

[5] [5]

The 2024 snapshot contains 3,671 articles, 1,801,481 cases, and 16.4M citation edges

Filter cases: 3–200 cited articles per decision. The 2024 snapshot contains 3,671 articles, 1,801,481 cases, and 16.4M citation edges. 2 Табл. 1: Retrieval baselines on the 2024 snapshot (200K cases, 1.8M predictions, 3,671 articles). Metric Adamic-Adar Common Neighbors Degree Random Hit@1 0.145 0.141 0.030<0.001 Hit@5 0.406 0.398 0.063 0.001 Hit@10 0.545...

work page 2024

[6] [6]

Since article composi- tion is controlled, this is pure temporal decay of co-citation structure

Fixed-article ablation (same articles, different years): MRR declines 33.2%. Since article composi- tion is controlled, this is pure temporal decay of co-citation structure

work page

[7] [7]

The original evaluation, which buildsCfrom all cases including the evaluation set, actually underestimates the real-world degradation

Train/test split (no data leakage): MRR declines 46.9% – stronger than the original 41.5%. The original evaluation, which buildsCfrom all cases including the evaluation set, actually underestimates the real-world degradation

work page

[8] [8]

Residual composition effect: the 8.3pp gap between original (41.5%) and fixed-article (33.2%) decay quantifies the contribution of compositional shift – new articles appearing in later years do account for roughly one-fifth of the observed decline. 5.5 Text-Based Baseline: BM25 To test whether text-based retrieval provides a temporally stable alternative,...

work page 2024

[9] [9]

Practitioners start from case facts, not from partial citation sets

Link prediction, not retrieval: our leave-one-out protocol measures citation prediction (given partial citations, recover the missing one), which is a proxy for statute retrieval but not the same task. Practitioners start from case facts, not from partial citation sets

work page

[10] [10]

Codex articles only: specific laws (by number/date) are not covered

work page

[11] [11]

Citation̸=relevance: ground truth conflates procedural and substantive citations

work page

[12] [12]

Single jurisdiction: results may not generalize to common-law systems where stare decisis creates different citation dynamics

work page

[13] [13]

Dense retrieval (E5, BGE-M3) may show different temporal dynamics than BM25

No dense retrieval baseline: our text baseline is BM25 (lexical); the embedding drift experiment uses E5-large for analysis but not as a retrieval method. Dense retrieval (E5, BGE-M3) may show different temporal dynamics than BM25

work page

[14] [14]

citation templates

Early-year anomalies: 2007 contains retrospective imports (15 cites/case vs. 4 average), and 2009 has only 52K decisions due to political crisis. Both outliers are retained but noted. 9 КУпАП 40-1 CivProc 279 CivProc 354 CivProc 13 CivProc 12 CivProc 265 CivProc 263 CivProc 259 CivProc 81 CivProc 247 КУпАП 284 CivProc 19 КУпАП 283 CivProc 260 CivProc 178 ...

work page 2007

[15] [15]

Adamic and Eytan Adar

Lada A. Adamic and Eytan Adar. Friends and neighbors on the web. Social Networks, 25(3): 211–230, 2003

work page 2003

[16] [16]

Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization; 2025

Ryan C. Barron, Maksim E. Eren, Olga M. Serafimova, Cynthia Matuszek, and Boian S. Alexandrov. Bridging legal knowledge and AI: Retrieval-augmented generation with vector stores, knowledge graphs, and hierarchical non-negative matrix factorization. arXiv preprint arXiv:2502.20364, 2025

work page arXiv 2025

[17] [17]

Measuring law over time: A network analytical framework with an application to statutes and regulations in the United States and Germany

Corinna Coupette, Janis Beckedorf, Dirk Hartung, Michael Bommarito, and Daniel Martin Katz. Measuring law over time: A network analytical framework with an application to statutes and regulations in the United States and Germany. Frontiers in Physics, 9, 2021

work page 2021

[18] [18]

Fowler, Timothy R

James H. Fowler, Timothy R. Johnson, James F. Spriggs, Sangick Jeon, and Paul J. Wahlbeck. Network analysis and the law: Measuring the legal importance of precedents at the U.S. Supreme Court. Political Analysis, 15(3):324–346, 2007

work page 2007

[19] [19]

Ho, Christopher R´ e, Adam Chilton, Alex Chohlas-Wood, Austin Peters, et al

Neel Guha, Julian Nyarko, Daniel E. Ho, Christopher R´ e, Adam Chilton, Alex Chohlas-Wood, Austin Peters, et al. LegalBench: A collaboratively built benchmark for measuring legal reasoni- ng in large language models. In NeurIPS Datasets and Benchmarks Track, 2023

work page 2023

[20] [20]

Incorporating Legal Structure in Retrieval-Augmented Generation: A Case Study on Copyright Fair Use; 2025

Justin Ho, Alexandra Colby, and William Fisher. Incorporating legal structure in retrieval- augmented generation: A case study on copyright fair use. arXiv preprint arXiv:2505.02164, 2025

work page arXiv 2025

[21] [21]

The link-prediction problem for social networks

David Liben-Nowell and Jon Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7):1019–1031, 2007. 11

work page 2007

[22] [22]

Bilingual BSARD: Extending statutory article retrieval to dutch

Ehsan Lotfi, Nikolay Banar, Nerses Yuzbashyan, and Walter Daelemans. Bilingual BSARD: Extending statutory article retrieval to dutch. In Proceedings of the Natural Legal Language Processing Workshop, 2024

work page 2024

[23] [23]

LEXTREME: A multi-lingual and multi-task benchmark for the legal domain

Joel Niklaus, Veton Matoshi, Pooja Rani, Andrea Galassi, Matthias St¨ urmer, and Ilias Chalki- dis. LEXTREME: A multi-lingual and multi-task benchmark for the legal domain. In Findings of the Association for Computational Linguistics: ACL 2023, 2023

work page 2023

[24] [24]

Citation graph analysis of 99.5M Ukrainian court decisions: Co-citation structure, temporal dynamics, and community evolution

Volodymyr Ovcharov. Citation graph analysis of 99.5M Ukrainian court decisions: Co-citation structure, temporal dynamics, and community evolution. arXiv preprint, 2025

work page 2025

[25] [25]

LeSICiN: A heterogeneous graph-based approach for automatic legal statute identification from indian legal documents

Shounak Paul, Pawan Goyal, and Saptarshi Ghosh. LeSICiN: A heterogeneous graph-based approach for automatic legal statute identification from indian legal documents. In Proceedings of AAAI, 2022

work page 2022

[26] [26]

Ho, and Joel Niklaus

Vishvaksenan Rasiah, Ronja Stern, Veton Matoshi, Matthias St¨ urmer, Ilias Chalkidis, Dani- el E. Ho, and Joel Niklaus. SCALE: Scaling up the complexity for advanced language model evaluation. In Proceedings of the Natural Legal Language Processing Workshop, 2023

work page 2023

[27] [27]

CaseGNN++: Graph contrastive learning for legal case retrieval with graph augmentation

Yanran Tang, Ruihong Qiu, Yilun Liu, Xue Li, and Zi Huang. CaseGNN++: Graph contrastive learning for legal case retrieval with graph augmentation. In Proceedings of SIGIR, 2024

work page 2024

[28] [28]

Multilingual E5 Text Embeddings: A Technical Report

Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, and Furu Wei. Multilingual E5 text embeddings: A technical report. arXiv preprint arXiv:2402.05672, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[29] [29]

Determining authority of dutch case law

Radboud Winkels, Jelle de Ruyter, and Henryk Kroese. Determining authority of dutch case law. Legal Knowledge and Information Systems, 2011. 12

work page 2011