Language models are unsupervised multitask learners. OpenAI Blog
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Representative citing papers
DExperts reaches 100% safety on explicit toxicity benchmarks but only 98.5% on implicit hate speech from ToxiGen while imposing a 10x latency increase on GPT-2.
citing papers explorer
- Please Make it Sound like Human: Encoder-Decoder vs. Decoder-Only Transformers for AI-to-Human Text Style Transfer
  BART-large outperforms Mistral-7B in AI-to-human style transfer with higher reference similarity scores and far fewer parameters, while showing that marker shift can reflect overshoot rather than accurate transfer.
- Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study
  DExperts reaches 100% safety on explicit toxicity benchmarks but only 98.5% on implicit hate speech from ToxiGen while imposing a 10x latency increase on GPT-2.