"President Vows to Cut <Taxes> Hair": Dataset and Analysis of Creative Text Editing for Humorous Headlines

Nabil Hossain , John Krumm , Michael Gamon

classification: cs.CL

keywords: headlines, edited, funny, humor, humorous, analysis, data, dataset

We introduce, release, and analyze a new dataset, called Humicroedit, for research in computational humor. Our publicly available data consists of regular English news headlines paired with versions of the same headlines that contain simple replacement edits designed to make them funny. We carefully curated crowdsourced editors to create funny headlines and judges to score a total of 15,095 edited headlines, with five judges per headline. The simple edits, usually just a single word replacement, mean we can apply straightforward analysis techniques to determine what makes our edited headlines humorous. We show how the data support classic theories of humor, such as incongruity, superiority, and setup/punchline. Finally, we develop baseline classifiers that can predict whether or not an edited headline is funny, which is a first step toward automatically generating humorous headlines as an approach to creating topical humor.
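
As a rough illustration of the data described in the abstract, the sketch below splits a marked-up headline into its original and edited forms and averages five judge scores. The angle-bracket notation is borrowed from the paper's title and is only assumed here to stand for the replaced word followed by its replacement; the function names `split_edit` and `mean_grade` and the example scores are hypothetical and are not taken from the released dataset.

```python
import re
from statistics import mean

# Assumed notation, modeled on the paper's title: the replaced word sits in
# angle brackets and is followed by the word that replaces it.
EDIT_PATTERN = re.compile(r"<(?P<original>[^<>]+)>\s+(?P<replacement>\S+)")


def split_edit(marked_headline):
    """Return (original_headline, edited_headline) for one marked-up headline."""
    match = EDIT_PATTERN.search(marked_headline)
    if match is None:
        raise ValueError("no <word> replacement marker found")
    prefix = marked_headline[:match.start()]
    suffix = marked_headline[match.end():]
    original = prefix + match.group("original") + suffix
    edited = prefix + match.group("replacement") + suffix
    return original, edited


def mean_grade(grades):
    """Average the judges' scores for one edited headline."""
    return mean(grades)


original, edited = split_edit("President Vows to Cut <Taxes> Hair")
print(original)
print(edited)

# Five made-up judge scores; the abstract does not state the grading scale.
print(mean_grade([2, 1, 3, 0, 2]))
```

A per-headline mean like this is one plausible target for a funny/not-funny baseline classifier once a threshold is chosen, although the abstract does not say how the authors derived their labels.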

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Assessing and Mitigating Miscalibration in LLM-Based Social Science Measurement
cs.AI 2026-05 unverdicted novelty 5.0

LLM confidence for social science text measurements is poorly calibrated across models, and a soft-label distillation pipeline reduces expected calibration error by 43% and Brier score by 34%.