The authors introduce a three-level formality spectrum (informal, casual, formal) and the 3LF dataset to correct supervision misalignment in formality transfer, reporting large gains in informal-to-formal performance on models including GPT variants.
Can LLM s Recognize Toxicity? A Structured Investigation Framework and Toxicity Metric
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
LLM safety judges resist adjusting evaluations when given contradictory context or new safety definitions, despite some ability to learn from new information.
A survey that catalogs threat models, detection approaches, and mitigation strategies for toxicity in multilingual LLMs while identifying challenges such as uneven language coverage and culturally variable harm definitions.
citing papers explorer
-
Casual as an Anchor: Resolving Supervision Misalignment in Formality Transfer Dataset
The authors introduce a three-level formality spectrum (informal, casual, formal) and the 3LF dataset to correct supervision misalignment in formality transfer, reporting large gains in informal-to-formal performance on models including GPT variants.
-
A Survey of Toxicity Detection and Mitigation Strategies for Multilingual Language Models
A survey that catalogs threat models, detection approaches, and mitigation strategies for toxicity in multilingual LLMs while identifying challenges such as uneven language coverage and culturally variable harm definitions.