Cohn, Nigel Shadbolt, and Michael Wooldridge
3 Pith papers cite this work.
citing papers explorer
- XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models
  XSTest is a benchmark for detecting exaggerated safety refusals in large language models on clearly safe prompts.
- StarCoder 2 and The Stack v2: The Next Generation
  StarCoder2-15B matches or beats CodeLlama-34B on code tasks despite being smaller, and StarCoder2-3B outperforms prior 15B models, with open weights and exact training data identifiers released.
- End-to-end PDDL Planning with Hardcoded and Dynamic Agents
  An end-to-end LLM framework refines natural language into valid PDDL domains and problems via hardcoded and dynamic agents, generates plans with standard engines, and returns readable output.