SG-LegalCite: A Principle-Augmented Benchmark for Legal Citation Retrieval in Singapore Law

Chloe Lee En Jia; Kaidong Feng; Shannon Lee Yueh Ern; Yingpeng Du; Zhu Sun

arxiv: 2605.21057 · v1 · pith:3CWX6XO6new · submitted 2026-05-20 · 💻 cs.IR

Shannon Lee Yueh Ern, Kaidong Feng, Yingpeng Du, Chloe Lee En Jia, Zhu Sun

Pith reviewed 2026-05-21 01:58 UTC · model grok-4.3

classification 💻 cs.IR

keywords legal citation retrieval · Singapore law · principle-augmented retrieval · case precedent · legal information retrieval · common law · benchmark dataset · legal NLP

The pith

Explicit legal principles provide stronger signals for retrieving relevant precedents than facts alone in Singapore law.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Legal citation retrieval currently relies on factual similarity or full judgments, which often bury or omit the specific legal principle that makes a precedent binding or persuasive. This leads models to surface cases that match on surface details but diverge on the doctrinal point at issue, a problem sharpened in Singapore where only local decisions bind and foreign ones are merely persuasive. The paper constructs SG-LegalCite, a dataset of 100,890 case-principle pairs drawn from 8,523 Singapore Supreme Court judgments, to test a retrieval paradigm that feeds both facts and the explicit governing principle into the query. Across eleven baselines, the principle-augmented approach improves ranking quality by giving models clearer discriminative cues. A sympathetic reader cares because better precedent retrieval directly supports more accurate legal reasoning and reduces the risk of citing mismatched authority.

Core claim

Augmenting retrieval queries with explicit legal principles extracted from judgments allows models to rank cited cases according to doctrinal relevance rather than factual overlap, as shown by consistent gains across baselines on the SG-LegalCite collection of Singapore Supreme Court decisions from 2000 to 2025.

What carries the argument

The principle-augmented retrieval paradigm, which builds queries from case facts plus the governing legal principle to rank precedents by doctrinal fit.
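The query construction described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the IDF-weighted term-overlap scorer merely stands in for a real retrieval baseline, and `build_query`, `rank`, the two candidate cases, and the example facts and principle are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_query(facts, principle=None):
    """Return a fact-only query, or facts plus the governing principle."""
    return facts if principle is None else f"{facts} {principle}"


def rank(query, corpus):
    """Rank candidate IDs by the summed IDF of the query terms they contain.

    A toy lexical scorer chosen for brevity; it is an assumption, not one of
    the baselines evaluated in the paper.
    """
    docs = {cid: set(tokenize(text)) for cid, text in corpus.items()}
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in terms)
    query_terms = set(tokenize(query))
    scores = {
        cid: sum(math.log(1 + n / df[t]) for t in query_terms & terms)
        for cid, terms in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical candidates: one factual lookalike, one doctrinal match.
corpus = {
    "case_A": "crane collapsed at a construction site and a worker was injured",
    "case_B": "duty of care requires legal proximity and policy considerations "
              "for pure economic loss",
}
facts = "a crane collapsed at a construction site causing loss to the contractor"
principle = "duty of care requires legal proximity and policy considerations"

fact_only_ranking = rank(build_query(facts), corpus)
augmented_ranking = rank(build_query(facts, principle), corpus)
```

In this toy setup the fact-only query places the factual lookalike (`case_A`) first, while the augmented query places the doctrinal match (`case_B`) first. Real systems would swap the scorer for a sparse or dense retriever, but the query-side change stays the same.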

If this is right

Retrieval systems will prioritize precedents that share the same legal rule over those that merely share factual patterns.
Legal AI tools can more closely follow real-world reasoning by treating principles as the primary matching criterion.
In Singapore, the approach helps separate binding domestic authority from merely persuasive foreign references.
Performance gains observed across multiple baselines suggest the paradigm is not tied to one model family.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same principle-extraction step could be adapted to create comparable benchmarks in other common-law jurisdictions.
Benchmarks that keep principle and fact entangled may systematically overestimate model performance on doctrinally relevant tasks.
Future experiments could test whether principle-augmented retrieval reduces citation errors when models encounter mixed domestic and foreign authorities.

Load-bearing premise

The automatic or semi-automatic extraction process accurately isolates the single governing legal principle from each judgment without significant entanglement with surrounding facts or context.

What would settle it

If principle-augmented queries show no consistent improvement in retrieval metrics over fact-only queries when evaluated on held-out Singapore cases, the central claim would be falsified.
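One way to operationalise this test is to score both query types on the same held-out queries and look at the per-query differences. The sketch below is an assumption about how such a comparison might be set up: the metric choice (reciprocal rank and recall@k), the function names, and the three toy queries are hypothetical and do not come from the paper.

```python
def reciprocal_rank(ranking, relevant):
    """Return 1/rank of the first relevant candidate, or 0.0 if none is found."""
    for position, cid in enumerate(ranking, start=1):
        if cid in relevant:
            return 1.0 / position
    return 0.0


def recall_at_k(ranking, relevant, k):
    """Fraction of the relevant candidates that appear in the top k."""
    return len(set(ranking[:k]) & relevant) / len(relevant)


def mean_reciprocal_rank(run, qrels):
    """Average reciprocal rank over all queries in the run."""
    return sum(reciprocal_rank(run[q], qrels[q]) for q in run) / len(run)


def paired_deltas(run_a, run_b, qrels):
    """Per-query reciprocal-rank difference (run_b minus run_a)."""
    return {
        q: reciprocal_rank(run_b[q], qrels[q]) - reciprocal_rank(run_a[q], qrels[q])
        for q in run_a
    }


# Hypothetical held-out data: cited (relevant) cases and two ranked runs.
qrels = {"q1": {"c1"}, "q2": {"c3"}, "q3": {"c1"}}
fact_only = {"q1": ["c3", "c1", "c2"], "q2": ["c2", "c3", "c1"], "q3": ["c1", "c2", "c3"]}
augmented = {"q1": ["c1", "c3", "c2"], "q2": ["c3", "c2", "c1"], "q3": ["c1", "c3", "c2"]}

mrr_fact = mean_reciprocal_rank(fact_only, qrels)
mrr_aug = mean_reciprocal_rank(augmented, qrels)
deltas = paired_deltas(fact_only, augmented, qrels)
improved = sum(d > 0 for d in deltas.values())
worsened = sum(d < 0 for d in deltas.values())
```

If `improved` does not clearly exceed `worsened` across held-out queries and baselines, that would count against the central claim. A paired significance test over `deltas` would be the natural next step on a real evaluation set.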

Figures

Figures reproduced from arXiv: 2605.21057 by Chloe Lee En Jia, Kaidong Feng, Shannon Lee Yueh Ern, Yingpeng Du, Zhu Sun.

**Figure 2.** SG-LegalCite dataset construction pipeline.

**Figure 3.** Distribution of judgments by legal domain in …

Original abstract

Legal citation in common-law systems depends not only on factual similarity, but also on the legal principle for which a precedent is invoked. However, existing benchmarks for legal citation retrieval use case facts, citation context, or full judgments as inputs, where the governing legal principle is often missing or only implicitly expressed and entangled with broader context. As a result, models may retrieve precedents that are factually similar yet doctrinally irrelevant. This limitation is particularly consequential in Singapore, where the legal system has evolved independently: only domestic precedents are binding, while foreign authorities serve merely as persuasive references. Thus, we propose a new retrieval paradigm that ranks cited cases based on queries integrating case facts and explicit legal principles, inspired by real-world legal reasoning workflows. To support this paradigm, we introduce SG-LegalCite, a dataset of 100,890 case-principle pairs extracted from 8,523 Singapore Supreme Court judgments spanning from 2000 to 2025. Experiments across 11 baselines demonstrate the effectiveness of our principle-augmented retrieval paradigm, showing that explicit legal principles provide strong discriminative signals for legal citation retrieval.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

SG-LegalCite supplies a sizable Singapore-specific dataset that pairs facts with extracted legal principles for citation retrieval, but the extraction quality and empirical details remain lightly documented.

The main thing to know is that this paper introduces SG-LegalCite, a dataset of roughly 100,000 case-principle pairs drawn from over 8,500 Singapore Supreme Court judgments. It tests a retrieval setup that feeds both case facts and explicit legal principles into the query, which aligns with how precedent actually works in a system that treats only domestic cases as binding. That framing is a clear step beyond benchmarks that rely on facts or full text alone. The scale and the jurisdiction focus are the parts that stand out as new.

Running standard baselines against this setup apparently produces better separation when principles are added, which is the sort of result that could help people building practical legal search tools in Singapore or similar common-law settings. The motivation is straightforward and the problem is well scoped.

The extraction process is the softer part. The abstract describes pulling principles from judgments but does not report human validation, agreement metrics, or any check that factual or contextual sentences have been stripped out. If the extracted principles still carry substantial factual material, the reported gains could reflect richer fact matching rather than cleaner doctrinal signals. The abstract also gives no concrete metrics, statistical tests, or error analysis, so the claim of strong discriminative signals is hard to weigh without the full tables.

This paper is mainly for researchers in legal information retrieval who need jurisdiction-specific resources or who work on precedent-aware ranking. A reader already building tools for Asian or constrained common-law systems would find the dataset worth examining once the construction details are filled in. It is worth sending to a serious referee because the core idea is sound and the data volume is decent, even if the current write-up leaves the extraction step open to straightforward questions about fidelity.

Referee Report

1 major / 1 minor

Summary. The paper introduces SG-LegalCite, a dataset of 100,890 case-principle pairs extracted from 8,523 Singapore Supreme Court judgments (2000–2025), and proposes a principle-augmented retrieval paradigm for legal citation retrieval. It evaluates this paradigm against 11 baselines and claims that incorporating explicit legal principles yields strong discriminative signals beyond factual similarity alone, addressing limitations in existing benchmarks where principles are implicit or entangled with context.

Significance. If the extracted principles are shown to be cleanly isolated from factual and contextual material, the work would provide a valuable jurisdiction-specific resource for legal IR research, particularly for common-law systems like Singapore's where binding authority is limited to domestic precedents. The new dataset and paradigm could support more doctrinally accurate retrieval models and inspire similar principle-focused benchmarks in other jurisdictions.

major comments (1)

[Section 3] Section 3 (dataset construction): the extraction of the 100,890 case-principle pairs from 8,523 judgments is described but supplies no human-annotated fidelity metrics, inter-annotator agreement scores, or ablation studies that remove factual sentences to isolate doctrinal content. This is load-bearing for the central claim that 'explicit legal principles provide strong discriminative signals,' because without such validation the observed gains over baselines could arise from richer factual matching rather than principle matching.

minor comments (1)

[Abstract] The abstract asserts that experiments with 11 baselines demonstrate effectiveness but provides no quantitative metrics, statistical tests, or error analysis; these details should be summarized in the abstract for clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the thoughtful and constructive review. The feedback on dataset validation is well-taken and directly relevant to the strength of our central claims. We address the major comment below and will incorporate revisions to strengthen the manuscript.

Referee: [Section 3] Section 3 (dataset construction): the extraction of the 100,890 case-principle pairs from 8,523 judgments is described but supplies no human-annotated fidelity metrics, inter-annotator agreement scores, or ablation studies that remove factual sentences to isolate doctrinal content. This is load-bearing for the central claim that 'explicit legal principles provide strong discriminative signals,' because without such validation the observed gains over baselines could arise from richer factual matching rather than principle matching.

Authors: We agree that explicit validation of principle isolation is important to rule out confounding from factual content. The extraction pipeline in Section 3 combines sentence-level pattern matching for explicit principle statements (e.g., 'The principle established in ...') with targeted LLM prompting to separate doctrinal holdings from factual recitals, which are typically demarcated in Singapore judgments. Nevertheless, we acknowledge the absence of quantitative fidelity checks. In the revised manuscript we will add: (1) a human evaluation on a stratified sample of 300 pairs annotated by two Singapore-qualified lawyers, reporting Cohen's kappa for principle-vs-fact classification; (2) an ablation that strips factual sentences from the principle representations and re-runs the retrieval experiments to quantify the contribution of doctrinal content alone. These additions will appear in a new subsection of Section 3 and updated experimental results.

Revision: yes
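For readers unfamiliar with the agreement statistic named in the rebuttal, Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement between two annotators and p_e is the agreement expected by chance from their label frequencies. The sketch below computes it from scratch. The ten labels are invented for illustration and are not results from the paper or from the promised 300-pair evaluation.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations: is each extracted span a principle or a fact?
annotator_1 = ["principle"] * 6 + ["fact"] * 4
annotator_2 = ["principle"] * 5 + ["fact"] * 4 + ["principle"]

kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy sample the annotators agree on 8 of 10 items (p_o = 0.8) and chance agreement is 0.52, so kappa is 0.28 / 0.48, roughly 0.58. The gap between raw agreement and kappa is why the chance-corrected figure is the more informative number to report.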

Circularity Check

0 steps flagged

No circularity: dataset construction and baseline evaluation are independent of reported results

The paper constructs SG-LegalCite by extracting 100,890 case-principle pairs from 8,523 judgments and then evaluates 11 standard retrieval baselines on principle-augmented queries versus fact-only or context-only inputs. No equations, fitted parameters, or self-citations appear in the provided text that would reduce the claimed discriminative gain to a tautology or input by construction. The extraction process is described as a one-time dataset creation step whose fidelity is not internally validated within the evaluation loop, and the baselines are off-the-shelf methods whose performance is measured externally. This leaves the central empirical claim self-contained against independent benchmarks rather than circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the assumption that legal principles can be reliably extracted as discrete, queryable units from judgments and that these units provide independent discriminative power beyond factual similarity.

axioms (1)

Domain assumption: Legal principles can be explicitly extracted from judgments and paired with cases without significant loss of doctrinal meaning.
Dataset construction and the principle-augmented paradigm rest on this extraction step.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We introduce SG-LegalCite, a dataset of 100,890 case–principle pairs extracted from 8,523 Singapore Supreme Court judgments... Experiments across 11 baselines demonstrate the effectiveness of our principle-augmented retrieval paradigm"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "explicit legal principles provide strong discriminative signals for legal citation retrieval"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.



[32] [32]

Overview of the

Mandal, Arpan and Ghosh, Kripabandhu and Bhattacharya, Arnab and Pal, Arindam and Ghosh, Saptarshi , booktitle =. Overview of the. 2017 , pages =

work page 2017

[33] [33]

Overview of the

Bhattacharya, Paheli and Ghosh, Kripabandhu and Ghosh, Saptarshi and Pal, Arindam and Mehta, Parth and Bhattacharya, Arnab and Majumder, Prasenjit , booktitle =. Overview of the. 2019 , pages =

work page 2019

[34] [34]

2024 , address =

T.y.s.s., Santosh and Haddad, Rashid and Grabmair, Matthias , booktitle =. 2024 , address =

work page 2024

[35] [35]

2025 , pages =

Hou, Abe Bohan and Weller, Orion and Qin, Guanghui and Yang, Eugene and Lawrie, Dawn and Holzenberger, Nils and Blair-Stanek, Andrew and Van Durme, Benjamin , booktitle =. 2025 , pages =

work page 2025

[36] [36]

2024 , month = aug, pages =

Mahari, Robert and Stammbach, Dominik and Ash, Elliott and Pentland, Alex , booktitle =. 2024 , month = aug, pages =

work page 2024

[37] [37]

, title =

Sutton, Stuart A. , title =. Journal of the American Society for Information Science , volume =. 1994 , doi =

work page 1994

[38] [38]

2004 , address=

Lin, Chin-Yew , booktitle=. 2004 , address=

work page 2004

[39] [39]

and Artzi, Yoav , booktitle=

Zhang, Tianyi and Kishore, Varsha and Wu, Felix and Weinberger, Kilian Q. and Artzi, Yoav , booktitle=. 2020 , url=

work page 2020

[40] [40]

2019 , month = jun, address =

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina , booktitle =. 2019 , month = jun, address =

work page 2019

[41] [41]

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing , pages=

Dense Passage Retrieval for Open-Domain Question Answering , author=. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing , pages=. 2020 , publisher=

work page 2020

[42] [42]

Sentence-

Reimers, Nils and Gurevych, Iryna , booktitle =. Sentence-. 2019 , month = nov, address =

work page 2019

[43] [43]

International Conference on Learning Representations , year=

Adapting Large Language Models via Reading Comprehension , author=. International Conference on Learning Representations , year=

work page

[44] [44]

Yao, Shunyu and Ke, Qingqing and Wang, Qiwei and Li, Kangtong and Hu, Jie , booktitle=. Lawyer. 2024 , organization=

work page 2024

[45] [45]

Fei, Zhiwei and Zhang, Songyang and Shen, Xiaoyu and Zhu, Dawei and Wang, Xiao and Ge, Jidong and Ng, Vincent , booktitle=. Intern. 2025 , organization=

work page 2025

[46] [46]

2024 , eprint =

Lawma: The Power of Specialization for Legal Tasks , author =. 2024 , eprint =

work page 2024

[47] [47]

Representation Learning with Contrastive Predictive Coding

Representation Learning with Contrastive Predictive Coding , author=. arXiv preprint arXiv:1807.03748 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[48] [48]

Proceedings of the 38th International Conference on Machine Learning , pages=

Learning Transferable Visual Models From Natural Language Supervision , author=. Proceedings of the 38th International Conference on Machine Learning , pages=. 2021 , volume=

work page 2021

[49] [49]

Proceedings of the 29th International Joint Conference on Artificial Intelligence , pages =

Shao, Yunqiu and Mao, Jiaxin and Liu, Yiqun and Ma, Weizhi and Satoh, Ken and Zhang, Min and Ma, Shaoping , title =. Proceedings of the 29th International Joint Conference on Artificial Intelligence , pages =. 2020 , publisher =

work page 2020

[50] [50]

Althammer, Sophia and Askari, Arian and Verberne, Suzan and Hanbury, Allan , title =. Proceedings of the 8th International Competition on Legal Information Extraction/Entailment, in association with the 18th International Conference on Artificial Intelligence and Law (

work page

[51] [51]

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing , pages =

Ethayarajh, Kawin , title =. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing , pages =. 2019 , address =

work page 2019

[52] [52]

Lee , title =

He, Qi and Pei, Jian and Kifer, Daniel and Mitra, Prasenjit and Giles, C. Lee , title =. Proceedings of the 19th International Conference on World Wide Web , pages =. 2010 , publisher =

work page 2010

[53] [53]

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages =

Bhagavatula, Chandra and Feldman, Sergey and Power, Russell and Ammar, Waleed , title =. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages =. 2018 , publisher =

work page 2018

[54] [54]

Proceedings of the 40th International

Ebesu, Travis and Fang, Yi , title =. Proceedings of the 40th International. 2017 , publisher =

work page 2017

[55] [55]

Scientometrics , volume =

Jeong, Chanwoo and Jang, Sion and Park, Eunjeong and Choi, Sungchul , title =. Scientometrics , volume =. 2020 , doi =

work page 2020

[56] [56]

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics , pages =

Cohan, Arman and Feldman, Sergey and Beltagy, Iz and Downey, Doug and Weld, Daniel , title =. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics , pages =. 2020 , publisher =

work page 2020

[57] [57]

Citation Recommendation: Approaches and Datasets , journal =

F. Citation Recommendation: Approaches and Datasets , journal =. 2020 , doi =

work page 2020

[58] [58]

and Krass, Mark S

Huang, Zihan and Low, Charles and Teng, Mengqiu and Zhang, Hongyi and Ho, Daniel E. and Krass, Mark S. and Grabmair, Matthias , title =. Proceedings of the 18th International Conference on Artificial Intelligence and Law , pages =. 2021 , publisher =

work page 2021

[59] [59]

Findings of the Association for Computational Linguistics:

Luo, Chu Fei and Bhambhoria, Rohan and Dahan, Samuel and Zhu, Xiaodan , title =. Findings of the Association for Computational Linguistics:. 2023 , publisher =

work page 2023

[60] [60]

, title =

Wang, Jie and Bansal, Kanha and Arapakis, Ioannis and Ge, Xuri and Jose, Joemon M. , title =. Proceedings of the 46th European Conference on Information Retrieval , pages =. 2024 , publisher =

work page 2024

[61] [61]

Proceedings of the 36th International Conference on Database and Expert Systems Applications , pages =

Wendlinger, Lorenz and Nonn, Simon Alexander and Al Zubaer, Abdullah and Granitzer, Michael , title =. Proceedings of the 36th International Conference on Database and Expert Systems Applications , pages =. 2025 , publisher =

work page 2025

[62] [62]

Singapore Law Watch , howpublished =

work page