For example, making sexist remarks in a discussion unrelated to gender issues or using cultural stereotypes to attack someone’s credibility.

Stereotyping (Identity Targeting): Using stereotypes or demographic-based insults to undermine or provoke others based on their identity, such as race, gender, or religion.

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Beyond Static Benchmarks: Synthesizing Harmful Content via Persona-based Simulation for Robust Evaluation

cs.CL · 2026-04-18 · unverdicted · novelty 5.0

A two-dimensional persona simulation framework generates harmful content that is more challenging to detect and comparably diverse to human-curated datasets for robust evaluation of detection systems.

citing papers explorer

Showing 1 of 1 citing paper.

Beyond Static Benchmarks: Synthesizing Harmful Content via Persona-based Simulation for Robust Evaluation cs.CL · 2026-04-18 · unverdicted · none · ref 12
