arXiv preprint arXiv:2103.06268 · 2021

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

LegalCiteBench: Evaluating Citation Reliability in Legal Language Models

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

LegalCiteBench reveals that current LLMs achieve under 7% accuracy on closed-book legal citation retrieval and completion tasks, with misleading answer rates above 94% for nearly all tested models.

Retrieval-Based Multi-Label Legal Annotation: Extensible, Data-Efficient and Hallucination-Free

cs.CL · 2026-05-16 · unverdicted · novelty 5.0

Retrieval with frozen embeddings and k-NN delivers competitive accuracy, high data efficiency, and zero hallucinations on legal multi-label annotation across ECtHR and Eurlex datasets.

A Few Good Clauses: Comparing LLMs vs Domain-Trained Small Language Models on Structured Contract Extraction

cs.CL · 2026-05-07 · unverdicted · novelty 5.0

Domain-trained small language model Olava Extract outperforms frontier LLMs on structured contract extraction with macro F1 0.812, micro F1 0.842, highest precision, and 78-97% lower inference cost.

A Benchmark for Gap and Overlap Analysis as a Test of KG Task Readiness

cs.AI · 2026-04-12 · unverdicted · novelty 5.0

The paper releases a benchmark of ten life-insurance contracts, a domain ontology, and 58 evidence-linked scenarios that shows ontology-driven knowledge graph queries produce more consistent and diagnosable gap/overlap results than text-only LLM inference.

citing papers explorer

Showing 4 of 4 citing papers.

LegalCiteBench: Evaluating Citation Reliability in Legal Language Models · cs.CL · 2026-05-11 · unverdicted · none · ref 4
Retrieval-Based Multi-Label Legal Annotation: Extensible, Data-Efficient and Hallucination-Free · cs.CL · 2026-05-16 · unverdicted · none · ref 15
A Few Good Clauses: Comparing LLMs vs Domain-Trained Small Language Models on Structured Contract Extraction · cs.CL · 2026-05-07 · unverdicted · none · ref 5
A Benchmark for Gap and Overlap Analysis as a Test of KG Task Readiness · cs.AI · 2026-04-12 · unverdicted · none · ref 11
