HAKARI-Bench reconstructs 35 benchmarks into 551 tasks across 43 languages, reproducing full MTEB, MMTEB, and BEIR rankings with Spearman correlation above 0.97 while supporting efficiency variant comparisons.
arXiv preprint arXiv:2310.17609
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
representative citing papers
M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.
The work introduces a should-change/should-not-change evaluation suite for legal LLMs and the LexGuard adversarial framework that uses SMT solvers to enforce legal consistency.
GLIER reformulates legal case retrieval as generative inference over latent legal variables such as charges and elements, then fuses generative, structural, and lexical signals. It outperforms baselines on the LeCaRD datasets and remains strong with only 10% of the training data.
citing papers explorer
- Which Changes Matter? Towards Trustworthy Legal AI via Relevance-Sensitive Evaluation and Solver-Grounded Reasoning
The work introduces a should-change/should-not-change evaluation suite for legal LLMs and the LexGuard adversarial framework that uses SMT solvers to enforce legal consistency.