ORFuzz presents the first evolutionary testing framework for LLM over-refusal together with a new benchmark of 1,855 cases that triggers over-refusal at 63.56% average across ten models.
Drhouse: An llm-empowered diagnostic reasoning system through harnessing outcomes from sensor data and expert knowledge,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.SE 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
ORFuzz: Fuzzing the "Other Side" of LLM Safety -- Testing Over-Refusal
ORFuzz presents the first evolutionary testing framework for LLM over-refusal together with a new benchmark of 1,855 cases that triggers over-refusal at 63.56% average across ten models.