Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

ReadMe++: Benchmarking Multilingual Language Models for Multi-Domain Readability Assessment · 2024 · DOI 10.18653/v1/2024.emnlp-main.682

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

ComplexityMT: Benchmarking the Interaction Between Text Complexity and Machine Translation

cs.CL · 2026-06-03 · unverdicted · novelty 6.0

The ComplexityMT benchmark finds that higher CEFR levels increase translation difficulty and that, in most of the six languages tested, MT systems often shift the CEFR level of the target text relative to the source.

Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

Introduces the DOSEBENCH benchmark and shows that four LLMs often fail at rolling 24-hour dose calculations and constraint adherence in OTC dosing decisions despite appearing confident.

citing papers explorer

Showing 2 of 2 citing papers.

ComplexityMT: Benchmarking the Interaction Between Text Complexity and Machine Translation
cs.CL · 2026-06-03 · unverdicted · none · ref 34
The ComplexityMT benchmark finds that higher CEFR levels increase translation difficulty and that, in most of the six languages tested, MT systems often shift the CEFR level of the target text relative to the source.

Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA
cs.CL · 2026-06-02 · unverdicted · none · ref 33
Introduces the DOSEBENCH benchmark and shows that four LLMs often fail at rolling 24-hour dose calculations and constraint adherence in OTC dosing decisions despite appearing confident.
