Compilation and linguistic analysis of 129 LLM prompt datasets identifies distinguishing features, with syntactic distributions enabling high-accuracy lightweight routing and quality prediction in three downstream tasks.
It includes paired comparison data from base and iterated models, as well as red-teaming transcripts designed to expose model vulnerabilities.
1 Pith paper cites this work.
Fields: cs.LG (1). Years: 2025 (1). Verdicts: ACCEPT (1).
Representative citing papers:
Large Language Model Prompt Datasets: An In-depth Analysis and Insights