The final score reported for this task is the average precision of the JSON entries generated by the model

Text-to-SQL: For the evaluation metric, each prediction is scored by counting how many fields and values (entries) in the predicted JSON string match the entries of the ground tr
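The scoring described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that predictions are flat JSON objects, that an entry counts as matched only when both field and value are exactly equal to the ground-truth entry, that unparseable or empty predictions score zero, and that "average precision" means the mean of per-prediction precision. The names `entry_precision` and `mean_entry_precision`, and the sample data, are hypothetical.

```python
import json


def entry_precision(predicted_json: str, truth: dict) -> float:
    """Fraction of predicted (field, value) entries found in the ground truth."""
    try:
        predicted = json.loads(predicted_json)
    except json.JSONDecodeError:
        return 0.0  # assumption: unparseable output scores zero
    if not isinstance(predicted, dict) or not predicted:
        return 0.0  # assumption: empty or non-object output scores zero
    matched = sum(
        1 for field, value in predicted.items()
        if field in truth and truth[field] == value
    )
    return matched / len(predicted)


def mean_entry_precision(predictions: list, truths: list) -> float:
    """Mean of per-prediction entry precision over the whole task."""
    scores = [entry_precision(p, t) for p, t in zip(predictions, truths)]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical sample data, for illustration only.
preds = [
    '{"table": "users", "column": "age", "op": ">"}',  # 2 of 3 entries match
    '{"table": "orders"}',                             # 1 of 1 entries match
]
truths = [
    {"table": "users", "column": "age", "op": ">="},
    {"table": "orders", "limit": 10},
]
final_score = mean_entry_precision(preds, truths)
```

Because this is precision rather than recall, the second prediction scores 1.0 even though it omits the `limit` entry; only the entries the model actually produced are checked.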

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Optimization before Evaluation: Evaluation with Unoptimised Prompts Can be Misleading

cs.AI · 2026-04-30 · unverdicted · novelty 6.0 · ref 7

Prompt optimization per model substantially alters LLM rankings on both public and internal benchmarks compared to using fixed unoptimized prompts.
