
Figure 8: Examples of the retrieval results of templated responses for detoxified model

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields: cs.CL (1)

years: 2026 (1)

verdicts: unverdicted (1)

representative citing papers

Detoxification for LLM: From Dataset Itself

cs.CL · 2026-04-21 · unverdicted · novelty 6.0

HSPD detoxifies pretraining corpora via hierarchical semantic-preserving rewriting with Soft Contrastive Decoding, cutting toxicity probability from 0.42 to 0.18 and expected maximum toxicity from 0.43 to 0.20 on GPT2-XL with consistent gains on other models.

citing papers explorer


  • Detoxification for LLM: From Dataset Itself · cs.CL · 2026-04-21 · unverdicted · none · ref 13
