Detecting linguistic bias in government documents using large language models
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
ToolRec introduces dual-level calibration of click data and weighted KTO alignment to improve tool-invoking query recommendations in on-device assistants, reporting CTR gains in large-scale A/B tests.
citing papers explorer
- MÖVE: A Holistic LLM Benchmark for the German Public Sector
MÖVE presents a new German-language benchmark evaluating 39 LLMs on performance and governance criteria using ten public-administration datasets.