
arxiv: 2606.09724 · v1 · pith:2ZJDOYSHnew · submitted 2026-06-08 · 💻 cs.AI

Beyond Probabilistic Similarity: Structural, Temporal, and Causal Limitations of Retrieval-Augmented Generation in the Legal Domain

Pith reviewed 2026-06-27 16:22 UTC · model grok-4.3

classification 💻 cs.AI
keywords retrieval-augmented generation · legal AI · probabilistic retrieval · legal knowledge structure · mereological blindness · diachronic blindness · causal opacity · deterministic retrieval

The pith

RAG systems fail in legal domains because probabilistic retrieval cannot match the hierarchical, temporal, and causal structure of legal knowledge.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that repeated legal AI failures, such as invented citations and outdated norms presented as current, arise from a structural incompatibility rather than insufficient model scale. Legal knowledge is defined by three interlocking properties drawn from classical theory: hierarchical mereological relations among norms, change over time while preserving operational closure, and traceable institutional origins tied to justification duties. These properties produce three retrieval pathologies (mereological blindness to part-whole relations, diachronic blindness to norm evolution, and causal opacity regarding provenance) that current methods treat separately and incompletely. The analysis leads to four required architectural commitments for any adequate legal retrieval system: ontological primacy, event reification, bitemporal correctness, and deterministic interaction protocols.

Core claim

The central claim is that failures of retrieval-augmented generation in law are symptoms of an architectural mismatch. Probabilistic similarity-based retrieval does not fit the three properties that constitute legal knowledge: hierarchical and mereological structure, diachronic dynamism under operational closure, and causal traceability of institutional provenance. Existing approaches address these requirements unevenly and do not yet form a coherent paradigm, so legal retrieval must move toward deterministic-by-design systems organized around ontological primacy, event reification, bitemporal correctness, and deterministic interaction protocols.

What carries the argument

The argument rests on a triad of properties of legal knowledge: hierarchical and mereological structure, diachronic dynamism under operational closure, and causal traceability of institutional provenance grounded in the duty of justification. Each property generates a corresponding retrieval pathology: mereological blindness, diachronic blindness, and causal opacity, respectively.
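
As a rough illustration of what mereological blindness means in practice, the sketch below stores norm components as a part-whole tree and always returns a fragment together with its ancestors, so an item is never read without the article and paragraph that frame it. The `Component` type, the `with_ancestors` helper, and the identifiers and texts are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Component:
    comp_id: str              # hypothetical identifier, e.g. "art-5/par-1"
    parent_id: Optional[str]  # None for the root of the hierarchy
    text: str


def with_ancestors(index: Dict[str, Component], comp_id: str) -> List[Component]:
    """Return the component preceded by all of its ancestors, root first."""
    chain: List[Component] = []
    seen = set()
    current: Optional[str] = comp_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in hierarchy at {current}")
        seen.add(current)
        node = index[current]  # KeyError if the id is unknown
        chain.append(node)
        current = node.parent_id
    return list(reversed(chain))


# Hypothetical three-level fragment of a statute.
COMPONENTS = [
    Component("art-5", None, "Art. 5. The following rules apply:"),
    Component("art-5/par-1", "art-5", "Par. 1. The deadline is suspended when:"),
    Component("art-5/par-1/item-ii", "art-5/par-1", "II - an appeal is pending."),
]
INDEX = {c.comp_id: c for c in COMPONENTS}

context = with_ancestors(INDEX, "art-5/par-1/item-ii")
for part in context:
    print(part.comp_id, "|", part.text)
```

A similarity-ranked chunk store could return only the item text; the lookup here is a plain traversal, so the same query always yields the same ordered chain.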

If this is right

  • Existing RAG methods address the three requirements unevenly and fail to compose them into a single paradigm.
  • Legal retrieval systems must prioritize ontological primacy, event reification, bitemporal correctness, and deterministic interaction protocols.
  • The framework targets quaestio juris (which norms apply and in what state) rather than downstream application of those norms.
  • Primary focus is legislative and constitutional retrieval, with interpretive time treated as an explicit extension.
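
One way to picture bitemporal correctness is a store that records two intervals per norm version: when the text was legally in force (valid time) and when the system held that record (transaction time). The following is a minimal sketch under that assumption; `NormVersion`, `version_as_of`, and the sample data are hypothetical names chosen for illustration, not an API described in the paper.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class NormVersion:
    norm_id: str
    text: str
    valid_from: date              # legal validity starts (inclusive)
    valid_to: Optional[date]      # legal validity ends (exclusive); None = open
    recorded_from: date           # when the store learned of this record
    recorded_to: Optional[date]   # when the record was superseded; None = current


def _covers(start: date, end: Optional[date], day: date) -> bool:
    return start <= day and (end is None or day < end)


def version_as_of(versions: List[NormVersion], norm_id: str,
                  valid_on: date, known_on: date) -> Optional[NormVersion]:
    """Deterministic lookup: at most one version may match both time axes."""
    hits = [v for v in versions
            if v.norm_id == norm_id
            and _covers(v.valid_from, v.valid_to, valid_on)
            and _covers(v.recorded_from, v.recorded_to, known_on)]
    if len(hits) > 1:
        raise ValueError("ambiguous store: overlapping versions")
    return hits[0] if hits else None


# Hypothetical history: an amendment effective 2010-02-01 is recorded on 2010-03-01.
STORE = [
    NormVersion("art-1", "old wording", date(2000, 1, 1), None,
                date(2000, 1, 1), date(2010, 3, 1)),
    NormVersion("art-1", "old wording", date(2000, 1, 1), date(2010, 2, 1),
                date(2010, 3, 1), None),
    NormVersion("art-1", "new wording", date(2010, 2, 1), None,
                date(2010, 3, 1), None),
]

# What was in force on 2010-02-15, as known today versus as known on 2010-02-20?
print(version_as_of(STORE, "art-1", date(2010, 2, 15), date(2020, 1, 1)).text)
print(version_as_of(STORE, "art-1", date(2010, 2, 15), date(2010, 2, 20)).text)
```

Keeping the two axes separate lets the store answer both "what was the law on that date" and "what did the system believe on that date", and the lookup raises an error on overlapping records instead of ranking candidates.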

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Legal retrieval would shift from ranking documents to maintaining explicit event histories of norm creation, amendment, and repeal.
  • The same mismatch argument could apply to regulatory or compliance domains that also require provenance and temporal validity tracking.
  • Diagnostic datasets could be built by annotating legal corpora with explicit hierarchy links, validity intervals, and source institutions to test the three pathologies directly.
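
The idea of maintaining explicit event histories can be sketched as an append-only log that is replayed to obtain a norm's state on a given date, along with the event that produced that state. This is only an illustrative sketch: `NormEvent`, `state_on`, the event kinds, and every identifier in the sample log are assumptions, not structures defined by the paper.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NormEvent:
    kind: str                   # "enact", "amend", or "repeal"
    norm_id: str
    effective: date
    institution: str            # body that issued the causing instrument
    act_id: str                 # hypothetical identifier of that instrument
    text: Optional[str] = None  # wording after the event; None for repeal


def state_on(events: List[NormEvent], norm_id: str,
             day: date) -> Tuple[Optional[str], Optional[NormEvent]]:
    """Replay events effective on or before `day`; return (text, causing event)."""
    text: Optional[str] = None
    cause: Optional[NormEvent] = None
    relevant = sorted(
        (e for e in events if e.norm_id == norm_id and e.effective <= day),
        key=lambda e: e.effective,  # stable sort: same-day events keep log order
    )
    for e in relevant:
        if e.kind in ("enact", "amend"):
            text = e.text
        elif e.kind == "repeal":
            text = None
        else:
            raise ValueError(f"unknown event kind: {e.kind}")
        cause = e
    return text, cause


# Hypothetical log for one article.
LOG = [
    NormEvent("enact", "art-7", date(1990, 1, 1), "Legislature", "law-100", "v1 text"),
    NormEvent("amend", "art-7", date(2005, 6, 1), "Legislature", "law-250", "v2 text"),
    NormEvent("repeal", "art-7", date(2015, 1, 1), "Legislature", "law-400"),
]

text, cause = state_on(LOG, "art-7", date(2010, 1, 1))
print(text, "| caused by", cause.act_id, "from", cause.institution)
```

Because the answer is derived by replay, each returned state carries its provenance (the causing act and institution), and a repealed norm resolves to no text rather than to its last wording.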

Load-bearing premise

Legal knowledge must be treated as possessing a hierarchical mereological structure, diachronic dynamism under operational closure, and causal traceability of institutional provenance.

What would settle it

A probabilistic RAG system that, without adopting the four deterministic commitments, produces no fabricated citations, no anachronistic norms, and correct institutional provenance across a corpus of legislative and constitutional queries spanning multiple time periods.
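
A test of this kind needs some way to score system output against an annotated corpus. The sketch below shows one possible shape for such a check, assuming a gold set of norms with validity intervals and issuing institutions; `GoldNorm`, `Citation`, `classify`, `tally`, and the sample records are hypothetical and do not reproduce the paper's detection criteria.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class GoldNorm:
    norm_id: str
    valid_from: date          # inclusive
    valid_to: Optional[date]  # exclusive; None = still in force
    institution: str


@dataclass(frozen=True)
class Citation:
    norm_id: str              # what the system cited
    as_of: date               # date the answer claims to describe
    claimed_institution: str


def classify(citation: Citation, gold: List[GoldNorm]) -> str:
    """Label one citation as fabricated, anachronistic, wrong_provenance, or ok."""
    entries = [g for g in gold if g.norm_id == citation.norm_id]
    if not entries:
        return "fabricated"
    in_force = [g for g in entries
                if g.valid_from <= citation.as_of
                and (g.valid_to is None or citation.as_of < g.valid_to)]
    if not in_force:
        return "anachronistic"
    if all(g.institution != citation.claimed_institution for g in in_force):
        return "wrong_provenance"
    return "ok"


def tally(citations: List[Citation], gold: List[GoldNorm]) -> Counter:
    return Counter(classify(c, gold) for c in citations)


# Hypothetical gold annotation and system output.
GOLD = [GoldNorm("law-A/art-1", date(2000, 1, 1), date(2010, 1, 1), "Parliament")]
OUTPUT = [
    Citation("law-Z/art-9", date(2005, 1, 1), "Parliament"),  # not in corpus
    Citation("law-A/art-1", date(2015, 1, 1), "Parliament"),  # no longer in force
    Citation("law-A/art-1", date(2005, 1, 1), "Ministry"),    # wrong issuer
    Citation("law-A/art-1", date(2005, 1, 1), "Parliament"),  # correct
]

print(tally(OUTPUT, GOLD))
```

Under this framing, the settling result would be a probabilistic system whose tally contains only `ok` across queries spanning multiple time periods.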

Original abstract

Retrieval-Augmented Generation (RAG) has become a standard architectural response to unreliability in legal AI, yet high-profile failures, including fabricated citations submitted to courts and anachronistic legal content presented as current, continue to appear across jurisdictions. We argue that these failures are not residual confabulations to be eliminated by scaling language models, but symptoms of an architectural mismatch between probabilistic retrieval and the hierarchical, temporal, and institutional structure of legal knowledge. We develop the argument in three moves. First, we articulate the ontological commitment of legal knowledge as a triad of properties derivable from classical legal theory: hierarchical and mereological structure, diachronic dynamism under operational closure, and causal traceability of institutional provenance grounded in the duty of justification. Second, we identify three corresponding pathologies of retrieval (mereological blindness, diachronic blindness, and causal opacity), each developed with an operational definition, a failure mechanism, a canonical example, and detection criteria for diagnostic use. Third, we review the state of the art through this lens, showing that existing approaches address these requirements unevenly and do not yet compose into a paradigm that treats them as co-constitutive. From this analysis we derive four architectural commitments that characterize the deterministic-by-design direction for legal retrieval: ontological primacy, event reification, bitemporal correctness, and deterministic interaction protocols. The framework concerns quaestio juris (which norms apply and in what state) rather than the downstream tasks that act on identified norms, and addresses legislative and constitutional retrieval primarily, with interpretive time as an explicit extension.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that persistent failures in legal RAG (fabricated citations, anachronistic content) are symptoms of an architectural mismatch between probabilistic retrieval and the hierarchical/mereological, diachronic (under operational closure), and causally traceable structure of legal knowledge. It derives this triad from classical legal theory, defines three corresponding pathologies with operational criteria and examples, reviews the state of the art through this lens, and extracts four deterministic architectural commitments (ontological primacy, event reification, bitemporal correctness, deterministic interaction protocols) focused on quaestio juris for legislative/constitutional retrieval.

Significance. If the framework holds, the paper supplies a diagnostic vocabulary and set of design commitments that could reorient legal retrieval research away from scaling toward explicit handling of legal ontology, temporality, and provenance. The explicit operational definitions and detection criteria for the pathologies, together with the derivation of four commitments, constitute a reusable position that could be tested in subsequent system-building work.

major comments (2)
  1. [First move / ontological commitment paragraph] The central move that the triad of properties is 'derivable from classical legal theory' is load-bearing for the entire argument yet is stated without naming specific theorists, texts, or derivation steps in the first move of the paper; this leaves the ontological commitment under-specified and risks appearing ad hoc.
  2. [Third move / state-of-the-art review] The claim that existing approaches 'address these requirements unevenly and do not yet compose into a paradigm' rests on the SOTA review; without an explicit mapping table or systematic scoring of each cited system against the three pathologies, the unevenness conclusion cannot be verified from the text alone.
minor comments (2)
  1. [Abstract and introduction] The abstract and introduction use 'operational closure' and 'duty of justification' without a brief parenthetical gloss on first use; adding one sentence would improve accessibility for an AI audience.
  2. [Derivation of commitments] The four architectural commitments are listed at the end of the abstract but their explicit linkage back to each pathology is not summarized in a single sentence or table; a short mapping would strengthen the closing argument.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these constructive comments, which identify opportunities to strengthen the explicitness of the argument. We address each major comment below and commit to revisions that directly respond to the concerns raised.

Point-by-point responses
  1. Referee: [First move / ontological commitment paragraph] The central move that the triad of properties is 'derivable from classical legal theory' is load-bearing for the entire argument yet is stated without naming specific theorists, texts, or derivation steps in the first move of the paper; this leaves the ontological commitment under-specified and risks appearing ad hoc.

    Authors: We agree that the first move would benefit from greater specificity to make the derivation transparent. In the revised manuscript we will expand the opening section to name key sources and outline the derivation steps: Hans Kelsen's Pure Theory of Law for the hierarchical and mereological structure of norms; institutional legal theory (e.g., MacCormick and Weinberger) for operational closure and diachronic dynamism; and the duty of justification in Raz and Alexy for causal traceability of institutional provenance. Brief derivation steps will be added showing how each property follows from these foundations. revision: yes

  2. Referee: [Third move / state-of-the-art review] The claim that existing approaches 'address these requirements unevenly and do not yet compose into a paradigm' rests on the SOTA review; without an explicit mapping table or systematic scoring of each cited system against the three pathologies, the unevenness conclusion cannot be verified from the text alone.

    Authors: We accept that an explicit mapping table would make the unevenness claim verifiable. The revised manuscript will include a new table in the state-of-the-art section that scores each reviewed system against the three pathologies using the operational definitions and detection criteria already provided in the paper. This will allow readers to inspect the assessment directly. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is a conceptual position paper with no equations, fitted parameters, predictions, or formal derivations. Its central triad of legal-knowledge properties is explicitly attributed to external classical legal theory rather than defined in terms of the paper's own outputs. The three pathologies and four architectural commitments are developed as analytical lenses with operational definitions and external examples, without any reduction to self-citation chains or renaming of known results. The argument is anchored in external legal theory and observed RAG failures, with no load-bearing steps that collapse by construction to the paper's inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that legal knowledge possesses the stated triad of properties; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Legal knowledge possesses hierarchical and mereological structure, diachronic dynamism under operational closure, and causal traceability of institutional provenance grounded in the duty of justification.
    Stated as derivable from classical legal theory in the first move of the argument.

pith-pipeline@v0.9.1-grok · 5814 in / 1373 out tokens · 23528 ms · 2026-06-27T16:22:44.123028+00:00 · methodology

