REDACT is a new systematically controlled multilingual PII detection benchmark with 51 entity types, sensitivity-tier metadata, and stratified evaluation revealing that rule-based detectors fail on high-stakes data while LLM detectors are more robust.
and Nishitha, Sanka Nithya Tanvy and Mukherji, Abhishek , year =
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
REDACT: A Systematically Controlled Multilingual Benchmark for Personal Information Detection
REDACT is a new systematically controlled multilingual PII detection benchmark with 51 entity types, sensitivity-tier metadata, and stratified evaluation revealing that rule-based detectors fail on high-stakes data while LLM detectors are more robust.