Magis-Bench is a new benchmark of 74 magistrate-level legal writing tasks from Brazilian exams where the strongest LLMs reach only 6.97/10, showing judicial reasoning remains difficult for current models.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: ACCEPT (1)
Representative citing papers:
Magis-Bench: Evaluating LLMs on Magistrate-Level Legal Tasks