Splits! is a Reddit dataset split by demographic and by topic, constructed and validated to support scalable, flexible investigation of sociocultural linguistic phenomena, using a two-stage filtering process to surface promising candidates.
### Input:
Target demographic: {target}
Contrast demographic: {contrast}
Topic: {topic}

Figure 19: Prompt used to generate a creative lexicon when both the demographics and the topic are given.
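The input block in Figure 19 is a template with three placeholders. As a minimal sketch of how it could be filled in, assuming plain Python string formatting (the paper's actual tooling is not stated here), the helper `fill_input` and the example values below are hypothetical and purely illustrative. Only the input portion shown in the figure is reproduced; any instruction text that precedes it in the full prompt is omitted.

```python
# Input portion of the prompt as shown in Figure 19.
# Assumption: the placeholders are substituted with simple string formatting.
INPUT_TEMPLATE = (
    "### Input:\n"
    "Target demographic: {target}\n"
    "Contrast demographic: {contrast}\n"
    "Topic: {topic}"
)


def fill_input(target: str, contrast: str, topic: str) -> str:
    """Hypothetical helper: substitute the three placeholders in the template."""
    return INPUT_TEMPLATE.format(target=target, contrast=contrast, topic=topic)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the dataset.
    print(fill_input("group A", "group B", "cooking"))
```

Using named placeholders keeps the template readable and makes a missing field fail loudly with a `KeyError` instead of silently producing an incomplete prompt.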
Cited by 1 Pith paper (field: cs.CL; year: 2025; verdict: UNVERDICTED):
Splits! Flexible Sociocultural Linguistic Investigation at Scale