Mandy Guo, Zihang Dai, Denny Vrandečić, and Rami Al-Rfou

URL https://arxiv · 2021 · arXiv 2109.00904

3 Pith papers cite this work.

representative citing papers

Retrieval-Based Multi-Label Legal Annotation: Extensible, Data-Efficient and Hallucination-Free

cs.CL · 2026-05-16 · unverdicted · novelty 5.0

Retrieval with frozen embeddings and k-NN delivers competitive accuracy, high data efficiency, and zero hallucinations on legal multi-label annotation across ECtHR and Eurlex datasets.

ImmigrationQA: A Source-Grounded Dataset and Small-Model Adaptation for U.S. Immigration Law

cs.CL · 2026-05-28 · unverdicted · novelty 4.0

A new source-grounded QA dataset for U.S. immigration law is built from official documents and used to fine-tune a 3B model, yielding a 27% mean score improvement over the base model on a held-out sample.

Mimir: Large-scale Multilingual Concept Modeling

cs.CL · 2026-05-24 · unverdicted · novelty 4.0

Mimir is a 1.6B multilingual concept model pretrained on 38.9 billion sentences across 46 languages and instruction-tuned on 66.8 million sentences across 35 languages, then compared to a token-based LM of similar size.

citing papers explorer

Retrieval-Based Multi-Label Legal Annotation: Extensible, Data-Efficient and Hallucination-Free · cs.CL · 2026-05-16 · unverdicted · none · ref 6
ImmigrationQA: A Source-Grounded Dataset and Small-Model Adaptation for U.S. Immigration Law · cs.CL · 2026-05-28 · unverdicted · none · ref 2
Mimir: Large-scale Multilingual Concept Modeling · cs.CL · 2026-05-24 · unverdicted · none · ref 25
