Mikail Demir, Hakan T

· 2025 · arXiv 2501.10915

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Validate Your Authority: Benchmarking LLMs on Multi-Label Precedent Treatment Classification

cs.CL · 2026-05-17 · unverdicted · novelty 7.0

Introduces a new dataset and Average Severity Error metric for benchmarking LLMs on multi-label legal precedent treatment classification.

Can Large Language Models Really Recognize Your Name?

cs.CR · 2025-05-20 · unverdicted · novelty 6.0

LLMs exhibit 20-40% lower recall on ambiguous human names for PII detection, worsening under prompt injections, as shown via the new AmBench benchmark.

citing papers explorer

Validate Your Authority: Benchmarking LLMs on Multi-Label Precedent Treatment Classification
cs.CL · 2026-05-17 · unverdicted · none · ref 14
Introduces a new dataset and Average Severity Error metric for benchmarking LLMs on multi-label legal precedent treatment classification.

Can Large Language Models Really Recognize Your Name?
cs.CR · 2025-05-20 · unverdicted · none · ref 16
LLMs exhibit 20-40% lower recall on ambiguous human names for PII detection, worsening under prompt injections, as shown via the new AmBench benchmark.
