Structural Dilemmas and Developmental Pathways of Legal Argument Mining in the Era of Artificial Intelligence

Chuanyi Li; Kun Chen; Xianglei Liao

arxiv: 2605.02308 · v1 · submitted 2026-05-04 · 💻 cs.CL

Xianglei Liao, Chuanyi Li, Kun Chen

Pith reviewed 2026-05-08 19:06 UTC · model grok-4.3

keywords: legal argument mining · artificial intelligence · argumentation theory · data standardization · domain adaptation · large language models · legal dogmatics · computational feasibility

The pith

Legal argument mining develops slowly primarily because it lacks a structured representational approach reconciling theoretical expressiveness with computational feasibility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper reviews research on legal argument mining, which links legal texts to AI-driven analysis. Progress has occurred along three dimensions: data resources, a technological shift from rule-based systems to large language models, and theoretical input from argumentation theory and legal dogmatics. The authors conclude that overall development remains slow not chiefly because of data scarcity or technical limits but because the field lacks a structured representational approach that balances legal theory with computational practicality. This gap produces concrete problems in standardizing data, building workable models, and adapting methods to legal domains. The study maps out directions for future work that reframe these issues.

Core claim

Despite ongoing progress, the overall development of legal argument mining remains relatively slow. Building on a systematic review of existing research, this study conducts an in-depth analysis and finds that this is due not only to data scarcity or technical limitations, but more fundamentally to the lack of a structured representational approach that reconciles theoretical expressiveness with computational feasibility. Specifically, this challenge manifests in dilemmas in data standardization, obstacles to effective modeling, and limitations in domain adaptation.

What carries the argument

A structured representational approach that reconciles theoretical expressiveness with computational feasibility for legal arguments.

If this is right

Developing a structured representational approach would directly address the identified dilemmas in data standardization.
Such an approach would reduce obstacles to effective modeling of legal arguments with current AI tools.
It would ease limitations in adapting techniques across different legal domains and systems.
Focusing future research on this representational gap would provide a clearer pathway for advancing the field.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

A workable representation could serve as a bridge for combining legal expertise with AI techniques more productively.
Benchmark experiments that introduce and test candidate representations on existing legal corpora would offer a direct way to assess the claim.
The same type of representational shortfall may hinder argument mining in other specialized domains such as medicine or finance.

Load-bearing premise

That the slow development is more fundamentally caused by the absence of a structured representational approach rather than other factors such as limited funding, low interdisciplinary collaboration, or insufficient real-world testing.

What would settle it

If a new structured representation for legal arguments is developed and adopted, and measurable gains then appear in data standardization, modeling effectiveness, and domain adaptation, this would support the claim; continued slow progress despite such a representation would challenge it.

Original abstract

Against the backdrop of rapid advances in artificial intelligence, legal argument mining has emerged as an important research area linking legal texts with intelligent analysis, carrying significant theoretical and practical implications. Existing studies have primarily developed along three dimensions: data, technology, and theory. At the data level, raw legal texts and annotated corpora constitute the foundational resources. At the technological level, research paradigms have evolved from rule-based systems and traditional machine learning to large language models (LLMs). At the theoretical level, argumentation theory and legal dogmatics provide important references for modeling argumentation structures. However, despite ongoing progress, the overall development of legal argument mining remains relatively slow. Building on a systematic review of existing research, this study conducts an in-depth analysis and finds that this is due not only to data scarcity or technical limitations, but more fundamentally to the lack of a structured representational approach that reconciles theoretical expressiveness with computational feasibility. Specifically, this challenge manifests in dilemmas in data standardization, obstacles to effective modeling, and limitations in domain adaptation. In response, the study proposes several key directions for future research. It aims to provide a reframing of key problems and a pathway for future development in legal argument mining, while leaving specific models and implementation schemes for further investigation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a literature review that organizes legal argument mining work around data, tech, and theory but asserts without metrics that missing structured representations are the main cause of slow progress.

This paper is a systematic review of legal argument mining that pulls together existing lines of work on data resources, modeling approaches from rules to LLMs, and theoretical foundations from argumentation theory. It frames the field's relatively slow development as stemming more fundamentally from the absence of a representational approach that balances expressiveness and computational needs, showing up in standardization dilemmas, modeling obstacles, and adaptation limits. The synthesis is clear and the proposed future directions give the subfield some structure for next steps. That part is useful for anyone trying to map the area quickly.

The central claim that the representational gap is more fundamental than data scarcity or technical limits rests on the authors' reading of the reviewed papers rather than any new measurements. There are no publication trend numbers, benchmark improvement timelines, or comparisons to fields that adopted better representations to show why this factor should be prioritized. The argument stays interpretive.

Readers already in legal informatics or AI-law will get value from the organized overview and the reframing of challenges. Someone looking for new methods, datasets, or tested predictions will not find them here. The work shows honest engagement with the cited literature and clear thinking about the problems, even if the causality judgment is not demonstrated. It deserves a serious referee because a well-executed review can help organize a subfield, provided the authors are asked to qualify the strength of their causal claim and perhaps add any available quantitative context from the literature they reviewed.

Referee Report

2 major / 2 minor

Summary. The manuscript conducts a systematic review of legal argument mining research organized along data, technology, and theory dimensions. It claims that despite incremental progress, overall development remains relatively slow, and attributes this not only to data scarcity or technical limits but more fundamentally to the absence of a structured representational approach reconciling theoretical expressiveness with computational feasibility. This gap is said to produce specific dilemmas in data standardization, obstacles to effective modeling, and limitations in domain adaptation. The paper catalogs existing work and outlines high-level future research directions without proposing concrete models or implementations.

Significance. The systematic review synthesizes literature across three axes and could serve as a useful reference for the field. If the central interpretive claim were supported by additional evidence, it would offer a reframing that might help prioritize representational research in legal argument mining, potentially aiding integration with LLMs and argumentation theory. The review itself provides no new experiments, metrics, or formal derivations.

major comments (2)

[Abstract and in-depth analysis] Abstract and in-depth analysis section: the assertion that lack of a structured representational approach is the 'more fundamental' cause of slow development (beyond data scarcity or technical limitations) is presented as a key finding from the systematic review, yet no quantitative support such as publication trend metrics, benchmark improvement rates over time, adoption statistics, or comparisons to other NLP subfields is supplied to justify the causal prioritization. This interpretive weighting is load-bearing for the thesis but rests solely on qualitative synthesis of cited work.
[Analysis of challenges] The section discussing manifestations of the challenge: claims of dilemmas in data standardization, modeling obstacles, and domain adaptation limits are cataloged from existing literature but not accompanied by any new comparative analysis or falsifiable predictions that would elevate the review's conclusions beyond assertion.

minor comments (2)

[Future directions] The proposed future research directions are described at a high level without concrete examples, pilot study outlines, or references to specific representational frameworks from adjacent fields that could be adapted.
[Introduction] Terminology such as 'structured representational approach' is used repeatedly but would benefit from an explicit early definition or illustrative example drawn from argumentation theory or computational linguistics to improve accessibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below, indicating planned revisions where appropriate while preserving the review's core interpretive synthesis.

Referee: [Abstract and in-depth analysis] Abstract and in-depth analysis section: the assertion that lack of a structured representational approach is the 'more fundamental' cause of slow development (beyond data scarcity or technical limitations) is presented as a key finding from the systematic review, yet no quantitative support such as publication trend metrics, benchmark improvement rates over time, adoption statistics, or comparisons to other NLP subfields is supplied to justify the causal prioritization. This interpretive weighting is load-bearing for the thesis but rests solely on qualitative synthesis of cited work.

Authors: We acknowledge that the prioritization of the representational gap as more fundamental rests on qualitative synthesis of the reviewed literature rather than new quantitative metrics. As a systematic review, the manuscript organizes existing work along data, technology, and theory axes to surface patterns that support this interpretive claim; quantitative trend analyses or cross-subfield benchmarks fall outside the scope of the current work and would require a separate empirical study. To address the concern, we will revise the abstract and in-depth analysis section to frame the claim more explicitly as an interpretive conclusion drawn from the synthesis, with additional citations illustrating how data and technical limitations have been repeatedly linked to representational shortcomings in the cited papers. We believe this clarification strengthens the presentation without altering the thesis.
revision: partial

Referee: [Analysis of challenges] The section discussing manifestations of the challenge: claims of dilemmas in data standardization, modeling obstacles, and domain adaptation limits are cataloged from existing literature but not accompanied by any new comparative analysis or falsifiable predictions that would elevate the review's conclusions beyond assertion.

Authors: The discussion of dilemmas in data standardization, modeling obstacles, and domain adaptation is intentionally a synthesis and cataloging of challenges reported across the reviewed literature, consistent with the goals of a systematic review. No new comparative analysis or falsifiable predictions are introduced because the paper does not conduct original experiments. In the future research directions section we already outline high-level pathways; we will expand this section to include concrete suggestions for comparative studies (e.g., cross-domain annotation consistency metrics) and example falsifiable predictions that subsequent modeling work could test. This addition will better connect the cataloged challenges to actionable next steps while remaining within the review format.
revision: partial
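
The rebuttal mentions "cross-domain annotation consistency metrics" without naming one. As a minimal sketch of what such a metric could look like, the block below computes Cohen's kappa between two annotators who label the same sentences. The choice of Cohen's kappa, the function name `cohen_kappa`, and the toy labels are assumptions made here for illustration; none of them comes from the paper or the rebuttal.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items on which the two labels coincide.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled independently
    # according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations of four sentences (not real corpus data).
annotator_1 = ["premise", "premise", "conclusion", "other"]
annotator_2 = ["premise", "conclusion", "conclusion", "other"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Running the same computation on samples drawn from different legal domains or corpora, and comparing the resulting values, would be one way to turn "annotation consistency" into a number that later studies could test against.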

Circularity Check

0 steps flagged

No circularity; literature review with interpretive claims independent of self-referential reductions

This is a systematic review paper that catalogs prior work on legal argument mining across data, technology, and theory, then offers an interpretive diagnosis of slow progress. No equations, fitted parameters, predictions, or derivations appear in the text. The central claim that the absence of a structured representational approach is the 'more fundamental' cause is presented as an analysis of external literature rather than a result forced by the paper's own definitions, fits, or self-citations. All load-bearing statements reference cited external sources without reducing to internal loops or renaming known results as new derivations. The paper is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper rests on the domain assumption that its review accurately identifies the root cause of slow progress and that addressing representational structure will unlock development; no free parameters or new entities are introduced.

axioms (1)

domain assumption: The slow development of legal argument mining stems more fundamentally from the lack of a structured representational approach reconciling theory and computation than from data scarcity or technical limits alone.
Presented in the abstract as the key finding from the systematic review of existing studies.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Foundation/ArithmeticOf.lean (initial Peano object as canonical structured representation) — superficial rhetorical echo only ArithmeticOf.canonical / logicNat_initial echoes

ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Constructing a Scalable Framework for Legal Argument Structure Representation ... establish 'premise,' 'conclusion,' and 'support' as foundational elements, while introducing a placeholder such as 'other' as an intermediate mechanism for extension.
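
The quoted passage names "premise," "conclusion," and "support" as foundational elements, with "other" as a placeholder for extension. The block below is one possible sketch of such a representation as a small typed graph. It is not the authors' scheme: treating "support" as a relation between elements, the optional `subtype` field, and all class and method names are assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ElementType(Enum):
    PREMISE = "premise"
    CONCLUSION = "conclusion"
    OTHER = "other"  # placeholder for elements the core scheme does not cover yet


class RelationType(Enum):
    SUPPORT = "support"
    OTHER = "other"  # assumed here: the same placeholder idea applied to relations


@dataclass(frozen=True)
class Element:
    id: str
    text: str
    type: ElementType
    subtype: Optional[str] = None  # hypothetical finer label, e.g. for OTHER elements


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    type: RelationType = RelationType.SUPPORT


@dataclass
class ArgumentGraph:
    elements: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_element(self, element):
        if element.id in self.elements:
            raise ValueError(f"duplicate element id: {element.id}")
        self.elements[element.id] = element

    def add_relation(self, relation):
        # Reject relations that point at elements the graph does not contain.
        for endpoint in (relation.source, relation.target):
            if endpoint not in self.elements:
                raise ValueError(f"unknown element id: {endpoint}")
        self.relations.append(relation)

    def supporters_of(self, target_id):
        """Return the elements that directly support the given element."""
        return [
            self.elements[r.source]
            for r in self.relations
            if r.target == target_id and r.type is RelationType.SUPPORT
        ]


# Hypothetical example sentences, invented for illustration.
graph = ArgumentGraph()
graph.add_element(Element("p1", "The defendant signed the contract.", ElementType.PREMISE))
graph.add_element(Element("c1", "The defendant is bound by the contract.", ElementType.CONCLUSION))
graph.add_element(Element("o1", "The court weighed the competing interests.", ElementType.OTHER, subtype="balancing"))
graph.add_relation(Relation("p1", "c1"))
```

In this sketch the `OTHER` type lets an annotator record material that does not fit the core labels without forcing it into one, and the free-text `subtype` could later be promoted to a first-class type once a category proves stable.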

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages

[1]

Data: From Raw Texts to Annotated Corpora
The quantity and quality of data directly affect the precision and granularity of legal argument mining. From the perspective of existing research paradigms, relevant data can be broadly divided into raw data and annotated data, with the key distinction being whether explicit information about argumentation struct...

[2]

"It should be noted that this evolution does not represent a simple replacement of earlier methods by later ones, but rather a shift in research focus and technological mainstream"

Technology: From Rule-Based Methods to LLMs
From a technical perspective, the evolution of legal argument mining can be broadly divided into three stages: an early rule-based phase, a middle stage driven by traditional machine learning methods grounded in feature engineering, and a more recent shift toward deep learning–based approaches with LLMs at their...

[3]

Theory: From Data-Driven Approaches to the Integration of Domain Theory
In contrast to the rapid development at the data and technical levels, systematic theoretical construction has long remained relatively weak in research on legal argument mining. Previous studies have largely focused on improving model performance and implementing specific tasks, whil...

[4]

Summary
Legal argument mining has made notable progress across the three dimensions of data, technology, and theory (see Table 1). At the data level, legal text resources have become increasingly abundant, and annotated corpora have gradually accumulated. At the technical level, research has evolved from rule-based methods to traditional machine learning ...

[5]

Challenges in Data and Annotation Standards
Legal argument mining first faces fundamental constraints at the level of data and annotation. As previously discussed, the current state of data in argument mining can be characterized as “abundant raw data but insufficient structural information, and high-value annotated data but limited in scale.” For fine-gr...

[6]

"premise–conclusion"

Challenges in Structural Representation and Computational Modeling
In addition to data constraints, a core difficulty in legal argumentation mining lies in the tension between structural representation and computational modeling. On the one hand, legal argumentation theory has developed a variety of fine-grained structural models, such as the Toulmin mode...

[7]

Challenges in Domain Adaptation and Evaluation
Argument mining in the legal domain faces significant cross-domain adaptation challenges. As an interdisciplinary field situated at the intersection of law and artificial intelligence, legal argument mining operates within a normative practice in which legal texts are shaped by institutional constraints, ofte...

[8]

Structural Root Cause: The Absence of an Intermediate Representational Layer
Taken together, the three dimensions discussed above (data, modeling, and domain adaptation) are not isolated problems. Rather, they converge on a deeper structural issue: the absence of a stable “representation layer” in legal argument mining, namely, a structured intermediate rep...

[9]

"premise–conclusion"

Constructing a Scalable Framework for Legal Argument Structure Representation
To begin with, it is necessary to introduce an extensible structured representation framework between theory and computation, in order to provide a unified description of the basic elements of legal argumentation and their interrelations. Unlike traditional representation scheme...

[10]

"data silos"

Promoting the Development of Standardized Annotated Corpora for Legal Argument
At present, research on annotated corpora in legal argument mining has primarily focused on increasing annotation volume and corpus size, resulting in the development of multiple corpora of different types and sizes [40]. However, the annotation schemes underlying these corpora...

[11]

Strengthening Domain Knowledge–Driven Computational Models
The identification of information such as parties, issues in dispute, statutory provisions, and factual circumstances by machine systems can largely be achieved based on the actual content and structure of legal data itself. However, the identification of legal argumentative elements and argume...

[12]

"legal expert – computer scientist – machine"

Exploring a Collaborative Research Paradigm among Legal Experts, Computational Scientists, and Machines
Human–machine collaboration is a fundamental principle for the application of artificial intelligence in advancing the rule of law. However, how to allocate tasks between humans and machines has long remained a difficult problem. With the rapid developm...

[13]

From Task-Oriented Approaches to Systematic Research Agendas
From the perspective of the overall research paradigm, legal argument mining should shift away from a task-centric research mode toward a more systematic research method centered on structured representation. Prior studies are largely organized around individual tasks, such as argument identific...

[14]

Seena Fazel, et al.,The predictive performance of criminal risk assessment tools used atsentencing: Systematic review of validation studies, Journal of Criminal Justice, Vol.81,2022,101902

work page 2022
[15]

Artificial Intelligence and Law,Vol

Masha Medvedeva, Michel Vols & Martijn Wieling , Using machine learning to predict decisions of the European Court of Human Rights. Artificial Intelligence and Law,Vol. 28, 2020, pp.237-266

work page 2020
[16]

Domain Theory

Lusheng Wang, On the Construction of “Domain Theory” of Legal Big Data(in Chinese), China Legal Science,No.2, 2020, pp.268-269

work page 2020
[17]

Marie-Francine Moens & Caroline Uyttendaele, Automatic Text Structuring and Categorization as a First Step in Summarizing Legal Cases, Information Processing & Management, Vol.33, 1997, pp.727-737

work page 1997
[18]

Marie-Francine Moens, et al., Automatic detection of arguments in legal texts, in: Proceedings of the 11th International Conference on Artificial Intelligence and Law, ACM Press, 2007, pp. 225-230

work page 2007
[19]

33, 2025

Gaspar Dugac & Tilmann Altwicker, Classifying legal interpretations using large language models, Artificial Intelligence and Law,Vol. 33, 2025

work page 2025
[20]

Raquel Mochales Palau & Marie-Francine Moens, Argumentation Mining: The Detection, Classification and Structure of Arguments in Text, in Proceedings of the 11th International Conference on Artificial Intelligence and Law , ACM Press, 2009, pp. 98-107

work page 2009
[21]

Farley, A model of argumentation and its application to legal reasoning, Artificial Intelligence and Law, Vol.4, pp.163–197

Kathleen Freeman & Arthur M. Farley, A model of argumentation and its application to legal reasoning, Artificial Intelligence and Law, Vol.4, pp.163–197

work page
[22]

Giulia Grundler, et al., Detecting arguments in CJEU decisions on fiscal state aid, in: Proceedings of the 9th Workshop on Argument Mining, 2022, pp.143-157

work page 2022
[23]

Raquel Mochales Palau & Marie-Francine Moens, Study on the structure of argumentation in case law, in: Proceedings of the 21st International Conference on Legal Knowledge and Information Systems, IOS Press, 2008, pp.11-20

work page 2008
[24]

Ashley, Using argument mining for legal text summarization, in: Legal Knowledge and Information Systems, Vol

Huihui Xu, Jaromír Šavelka & Kevin D. Ashley, Using argument mining for legal text summarization, in: Legal Knowledge and Information Systems, Vol. 334, IOS Press, 2020, pp.184-193

work page 2020
[25]

4, 2012, pp.38-64

Douglas Walton, Argument mining by applying argumentation schemes, Studies in Logic,Vol. 4, 2012, pp.38-64

work page 2012
[26]

Jian Yuan, et al., Overview of SMP-CAIL2020-Argmine: The Interactive Argument-Pair Extraction in Judgement Document Challenge, Data Intelligence,Vol.3, 2021, pp.287-307

work page 2021
[27]

27, 2019, pp.141-170

Hiroaki Yamada, Simone Teufel & Takenobu Tokunaga, Building a corpus of legal argumentation in Japanese judgement documents: towards structure-based summarisation, Artificial Intelligence and Law,Vol. 27, 2019, pp.141-170

work page 2019
[28]

R¯uta Liepina, et al., Legal argument mining: recent trends and open challenges, in: Proceedings of the First Argument Mining and Empirical Legal Research Workshop, 2025

work page 2025
[29]

Marie-Francine Moens, Argumentation mining: how can a machine acquire common sense and world knowledge?, Argument & Computation,Vol.9, 2018, pp.1-14

work page 2018
[30]

Hao Li, et al.,Large Language Models in Argument Mining: A Survey, arXiv:2506.16383 [cs.CL]

work page arXiv
[31]

Weikang Yuan, et al.,Can Large Language Models Grasp Legal Theories? : Enhance Legal Reasoning with Insights from Multi-Agent Collaboration,Findings of the Association for Computational Linguistics: EMNLP 2024, pp.7577-7597

work page 2024
[32]

2898–2904

Ilias Chalkidis, et al., LEGAL-BERT: the muppets straight out of law school, in: Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 2898–2904

work page 2020
[33]

Lucia Zheng, et al., When does pretraining help? Assessing self-supervised learning for law and the CaseHOLD dataset of 53,000+ legal holdings, in: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, 2021, pp. 159-168

work page 2021
[34]

Gechuan Zhang, Paul Nulty & David Lillis, Enhancing legal argument mining with domain pre-training and neural networks, Journal of Data Mining and Digital Humanities, 2022

work page 2022
[35]

Ashley, Accounting for sentence position and legal domain sentence embedding in learning to classify case sentences, in: Legal Knowledge and Information Systems, Vol

Huihui Xu, Jaromir Savelka & Kevin D. Ashley, Accounting for sentence position and legal domain sentence embedding in learning to classify case sentences, in: Legal Knowledge and Information Systems, Vol. 346, IOS Press, 2021, pp. 33–42

work page 2021
[36]

Huihui Xu, Jaromir Savelka & Kevin D. Ashley, Toward summarizing case decisions via extracting argument issues, reasons, and conclusions, in: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, ACM Press, 2021, pp. 250–254

work page 2021
[37]

Ivan Habernal, et al., Mining legal arguments in court decisions, Artificial Intelligence and Law, Vol.31, 2023, pp.557-594

work page 2023
[38]

Lena Hel & Ivan Habernal, Contemporary LLMs struggle with extracting formal legal arguments, in: Proceedings of the Natural Legal Language Processing Workshop 2025, 2025, pp.292-303

work page 2025
[39]

Purbid Bambroo, et al., MARRO: multi-headed attention for rhetorical role labeling in legal documents, arXiv:2503.10659v1 [cs.CL] 08 Mar 2025

work page arXiv 2025
[40]

van Eemeren ,Rob Grootendorst & A

Frans H. van Eemeren ,Rob Grootendorst & A. Francisca Snoeck Henkemans, Argumentation: analysis, evaluation, presentation, Lawrence Erlbaum Associates, 2002, pp. 64–66

work page 2002
[41]

Kilian Lüders & Bent Stohlmann, Classifying proportionality - identification of a legal argument, Artificial Intelligence and Law,Vol.33, 2025, pp.1051-1078

work page 2025
[42]

Toulmin, The uses of argument, Cambridge University Press, 2003, pp.87-95

Stephen E. Toulmin, The uses of argument, Cambridge University Press, 2003, pp.87-95

work page 2003
[43]

Ashley, Artificial intelligence and legal analytics: new tools for law practice in the digital age, 22 Cambridge University Press, 2017, p

Kevin D. Ashley, Artificial intelligence and legal analytics: new tools for law practice in the digital age, 22 Cambridge University Press, 2017, p. 130

work page 2017
[44]

Giulia Grundler, et al., AMELIA-Argument Mining Evaluation on Legal documents in ItAlian: A CALAMITA challenge, in: Proceedings of the 10th Italian Conference on Computational Linguistics, Pisa, Italy, 2024

work page 2024
[45]

Gordon, Henry Prakken & Douglas Walton, The Carneades model of argument and burden of proof, Artificial Intelligence,Vol

Thomas F. Gordon, Henry Prakken & Douglas Walton, The Carneades model of argument and burden of proof, Artificial Intelligence,Vol. 171, 2007,pp. 875-881

work page 2007
[46]

Douglas Walton, Argument Evaluation and Evidence, Springer International Publishing, 2016, p.126-129

work page 2016
[47]

Catherine Uyttendaele, Marie-Francine Moens & Jos Dumortier, SALOMON: Automatic Abstracting of Legal Cases for Effective Access to Court Decisions, Artificial Intelligence and Law, 1998,Vol.6, pp.59-79

work page 1998
[48]

Raquel Mochales Palau & Marie-Francine Moens, Study on the structure of argumentation in case law, in: Proceedings of the 21st International Conference on Legal Knowledge and Information Systems, IOS Press, 2008, pp. 11–20

work page 2008
[49]

Basit Ali, et al.,Constructing A Dataset of Support and Attack Relations in Legal Arguments in Court Judgements using Linguistic Rules, in: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pp.491-500

work page 2022
[50]

Caselaw using LLMs, arXiv:2603.08286 [cs.CL]

Serene Wang, Lavanya Pobbathi & Haihua Chen, LAMUS: A Large-Scale Corpus for Legal Argument Mining from U.S. Caselaw using LLMs, arXiv:2603.08286 [cs.CL]

work page arXiv
[51]

Gechuan Zhang, David Lillis & Paul Nulty, Can domain pre-training help interdisciplinary researchers from data annotation poverty? A case study of legal argument mining with bert-based transformers, in: Proceedings of the Workshop on Natural Language Processing for Digital Humanities, 2021, pp. 121-130

work page 2021
[52]

Feteris,Weighing and Balancing in the Justification of Judicial Decisions, Informal Logic, Vol.28, pp.20-30(2008)

Eveline T. Feteris,Weighing and Balancing in the Justification of Judicial Decisions, Informal Logic, Vol.28, pp.20-30(2008)

work page 2008
[53]

(Eds.): Natural Language Processing and Information Systems, Springer, 2022, pp.240-252

Gechuan Zhang, Paul Nulty & David Lillis, A Decade of Legal Argumentation Mining: Datasets and Approaches, in Paolo Rosso et al. (Eds.): Natural Language Processing and Information Systems, Springer, 2022, pp.240-252

work page 2022
[54]

Kun Chen, et al., Guidelines for the annotation and visualization of legal argumentation structures in Chinese judicial decisions, arXiv:2603.05171

work page arXiv
[55]

Jérémie Cabessa, Hugo Hernault & Umer Mushtaq, Argument Mining with Fine-Tuned Large Language Models,in: Proceedings of the 31st International Conference on Computational Linguistics,2025, pp.6624-6635

[56]

Dezhao Song, et al., Knowledge Graph-Assisted LLM Post-Training for Enhanced Legal Reasoning, arXiv:2601.13806 [cs.CL]

[57]

John Lawrence & Chris Reed, Argument Mining: A Survey, Computational Linguistics, Vol. 45, pp. 765–818

[58]

Vanessa Wei Feng & Graeme Hirst, Classifying arguments by scheme, in: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, 2011, pp. 987–996

[59]

Aleksander Smywiński-Pohl & Tomer Libal, Enhancing legal argument retrieval with optimized language model techniques, in: JSAI International Symposium on Artificial Intelligence, Springer, 2024, pp. 93–108


[1]

Data: From Raw Texts to Annotated Corpora

The quantity and quality of data directly affect the precision and granularity of legal argument mining. From the perspective of existing research paradigms, relevant data can be broadly divided into raw data and annotated data, with the key distinction being whether explicit information about argumentation struct...

[2]

It should be noted that this evolution does not represent a simple replacement of earlier methods by later ones, but rather a shift in research focus and technological mainstream

Technology: From Rule-Based Methods to LLMs

From a technical perspective, the evolution of legal argument mining can be broadly divided into three stages: an early rule-based phase, a middle stage driven by traditional machine learning methods grounded in feature engineering, and a more recent shift toward deep learning–based approaches with LLMs at their...

[3]

Theory: From Data-Driven Approaches to the Integration of Domain Theory

In contrast to the rapid development at the data and technical levels, systematic theoretical construction has long remained relatively weak in research on legal argument mining. Previous studies have largely focused on improving model performance and implementing specific tasks, whil...

[4]

Summary

Legal argument mining has made notable progress across the three dimensions of data, technology, and theory (see Table 1). At the data level, legal text resources have become increasingly abundant, and annotated corpora have gradually accumulated. At the technical level, research has evolved from rule-based methods to traditional machine learning ...

[5]

Challenges in Data and Annotation Standards

Legal argument mining first faces fundamental constraints at the level of data and annotation. As previously discussed, the current state of data in argument mining can be characterized as “abundant raw data but insufficient structural information, and high-value annotated data but limited in scale.” For fine-gr...

[6]

premise–conclusion

Challenges in Structural Representation and Computational Modeling

In addition to data constraints, a core difficulty in legal argument mining lies in the tension between structural representation and computational modeling. On the one hand, legal argumentation theory has developed a variety of fine-grained structural models, such as the Toulmin mode...

[7]

Challenges in Domain Adaptation and Evaluation

Argument mining in the legal domain faces significant cross-domain adaptation challenges. As an interdisciplinary field situated at the intersection of law and artificial intelligence, legal argument mining operates within a normative practice in which legal texts are shaped by institutional constraints, ofte...

[8]

Structural Root Cause: The Absence of an Intermediate Representational Layer

Taken together, the three dimensions discussed above (data, modeling, and domain adaptation) are not isolated problems. Rather, they converge on a deeper structural issue: the absence of a stable “representation layer” in legal argument mining, namely, a structured intermediate rep...

[9]

premise–conclusion

Constructing a Scalable Framework for Legal Argument Structure Representation

To begin with, it is necessary to introduce an extensible structured representation framework between theory and computation, to provide a unified description of the basic elements of legal argumentation and their interrelations. Unlike traditional representation scheme...

[10]

data silos

Promoting the Development of Standardized Annotated Corpora for Legal Argument

At present, research on annotated corpora in legal argument mining has primarily focused on increasing annotation volume and corpus size, resulting in the development of multiple corpora of different types and sizes [40]. However, the annotation schemes underlying these corpora...

[11]

Strengthening Domain Knowledge–Driven Computational Models

The identification of information such as parties, issues in dispute, statutory provisions, and factual circumstances by machine systems can largely be achieved based on the actual content and structure of legal data itself. However, the identification of legal argumentative elements and argume...

[12]

legal expert – computer scientist – machine

Exploring a Collaborative Research Paradigm among Legal Experts, Computational Scientists, and Machines

Human–machine collaboration is a fundamental principle for the application of artificial intelligence in advancing the rule of law. However, how to allocate tasks between humans and machines has long remained a difficult problem. With the rapid developm...

[13]

From Task-Oriented Approaches to Systematic Research Agendas

From the perspective of the overall research paradigm, legal argument mining should shift away from a task-centric research mode toward a more systematic research method centered on structured representation. Prior studies are largely organized around individual tasks, such as argument identific...

[14]

Seena Fazel, et al., The predictive performance of criminal risk assessment tools used at sentencing: Systematic review of validation studies, Journal of Criminal Justice, Vol. 81, 2022, 101902

[15]

Masha Medvedeva, Michel Vols & Martijn Wieling, Using machine learning to predict decisions of the European Court of Human Rights, Artificial Intelligence and Law, Vol. 28, 2020, pp. 237–266

[16]

Lusheng Wang, On the Construction of “Domain Theory” of Legal Big Data (in Chinese), China Legal Science, No. 2, 2020, pp. 268–269

[17]

Marie-Francine Moens & Caroline Uyttendaele, Automatic Text Structuring and Categorization as a First Step in Summarizing Legal Cases, Information Processing & Management, Vol. 33, 1997, pp. 727–737

[18]

Marie-Francine Moens, et al., Automatic detection of arguments in legal texts, in: Proceedings of the 11th International Conference on Artificial Intelligence and Law, ACM Press, 2007, pp. 225–230

[19]

Gaspar Dugac & Tilmann Altwicker, Classifying legal interpretations using large language models, Artificial Intelligence and Law, Vol. 33, 2025

[20]

Raquel Mochales Palau & Marie-Francine Moens, Argumentation Mining: The Detection, Classification and Structure of Arguments in Text, in: Proceedings of the 12th International Conference on Artificial Intelligence and Law, ACM Press, 2009, pp. 98–107

[21]

Kathleen Freeman & Arthur M. Farley, A model of argumentation and its application to legal reasoning, Artificial Intelligence and Law, Vol. 4, pp. 163–197

[22]

Giulia Grundler, et al., Detecting arguments in CJEU decisions on fiscal state aid, in: Proceedings of the 9th Workshop on Argument Mining, 2022, pp. 143–157

[23]

Raquel Mochales Palau & Marie-Francine Moens, Study on the structure of argumentation in case law, in: Proceedings of the 21st International Conference on Legal Knowledge and Information Systems, IOS Press, 2008, pp. 11–20

[24]

Huihui Xu, Jaromír Šavelka & Kevin D. Ashley, Using argument mining for legal text summarization, in: Legal Knowledge and Information Systems, Vol. 334, IOS Press, 2020, pp. 184–193

[25]

Douglas Walton, Argument mining by applying argumentation schemes, Studies in Logic, Vol. 4, 2012, pp. 38–64

[26]

Jian Yuan, et al., Overview of SMP-CAIL2020-Argmine: The Interactive Argument-Pair Extraction in Judgement Document Challenge, Data Intelligence, Vol. 3, 2021, pp. 287–307

[27]

Hiroaki Yamada, Simone Teufel & Takenobu Tokunaga, Building a corpus of legal argumentation in Japanese judgement documents: towards structure-based summarisation, Artificial Intelligence and Law, Vol. 27, 2019, pp. 141–170

[28]

Rūta Liepina, et al., Legal argument mining: recent trends and open challenges, in: Proceedings of the First Argument Mining and Empirical Legal Research Workshop, 2025

[29]

Marie-Francine Moens, Argumentation mining: how can a machine acquire common sense and world knowledge?, Argument & Computation, Vol. 9, 2018, pp. 1–14

[30]

Hao Li, et al., Large Language Models in Argument Mining: A Survey, arXiv:2506.16383 [cs.CL]

[31]

Weikang Yuan, et al., Can Large Language Models Grasp Legal Theories? Enhance Legal Reasoning with Insights from Multi-Agent Collaboration, in: Findings of the Association for Computational Linguistics: EMNLP 2024, pp. 7577–7597

[32]

Ilias Chalkidis, et al., LEGAL-BERT: the muppets straight out of law school, in: Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 2898–2904

[33]

Lucia Zheng, et al., When does pretraining help? Assessing self-supervised learning for law and the CaseHOLD dataset of 53,000+ legal holdings, in: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, 2021, pp. 159–168

[34]

Gechuan Zhang, Paul Nulty & David Lillis, Enhancing legal argument mining with domain pre-training and neural networks, Journal of Data Mining and Digital Humanities, 2022

[35]

Huihui Xu, Jaromír Šavelka & Kevin D. Ashley, Accounting for sentence position and legal domain sentence embedding in learning to classify case sentences, in: Legal Knowledge and Information Systems, Vol. 346, IOS Press, 2021, pp. 33–42

[36]

Huihui Xu, Jaromír Šavelka & Kevin D. Ashley, Toward summarizing case decisions via extracting argument issues, reasons, and conclusions, in: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, ACM Press, 2021, pp. 250–254

[37]

Ivan Habernal, et al., Mining legal arguments in court decisions, Artificial Intelligence and Law, Vol. 31, 2023, pp. 557–594

[38]

Lena Hel & Ivan Habernal, Contemporary LLMs struggle with extracting formal legal arguments, in: Proceedings of the Natural Legal Language Processing Workshop 2025, 2025, pp. 292–303

[39]

Purbid Bambroo, et al., MARRO: multi-headed attention for rhetorical role labeling in legal documents, arXiv:2503.10659v1 [cs.CL], 8 Mar 2025

[40]

Frans H. van Eemeren, Rob Grootendorst & A. Francisca Snoeck Henkemans, Argumentation: analysis, evaluation, presentation, Lawrence Erlbaum Associates, 2002, pp. 64–66

[41]

Kilian Lüders & Bent Stohlmann, Classifying proportionality - identification of a legal argument, Artificial Intelligence and Law, Vol. 33, 2025, pp. 1051–1078

[42]

Stephen E. Toulmin, The uses of argument, Cambridge University Press, 2003, pp. 87–95

[43]

Kevin D. Ashley, Artificial intelligence and legal analytics: new tools for law practice in the digital age, Cambridge University Press, 2017, p. 130

[44]

Giulia Grundler, et al., AMELIA-Argument Mining Evaluation on Legal documents in ItAlian: A CALAMITA challenge, in: Proceedings of the 10th Italian Conference on Computational Linguistics, Pisa, Italy, 2024

[45]

Thomas F. Gordon, Henry Prakken & Douglas Walton, The Carneades model of argument and burden of proof, Artificial Intelligence, Vol. 171, 2007, pp. 875–881

[46]

Douglas Walton, Argument Evaluation and Evidence, Springer International Publishing, 2016, pp. 126–129

[47]

Caroline Uyttendaele, Marie-Francine Moens & Jos Dumortier, SALOMON: Automatic Abstracting of Legal Cases for Effective Access to Court Decisions, Artificial Intelligence and Law, Vol. 6, 1998, pp. 59–79
