Compilation and linguistic analysis of 129 LLM prompt datasets identifies distinguishing features, with syntactic distributions enabling high-accuracy lightweight routing and quality prediction in three downstream tasks.
It comprises 8 million formal statements and corresponding proofs generated from high-school and undergraduate-level mathematical contest problems.
1 Pith paper cites this work.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: ACCEPT (1)
Representative citing papers
Large Language Model Prompt Datasets: An In-depth Analysis and Insights