arXiv preprint arXiv:2212.06094

Luca Beurer-Kellner, Marc Fischer, Martin Vechev · 2022 · arXiv 2212.06094

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

The Constraint Tax: Measuring Validity-Correctness Tradeoffs in Structured Outputs for Small Language Models

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

Enforcing hard schemas on sub-3B models raises schema validity to 100% but drops answer accuracy from 19.7% to 11.0% and executable accuracy from 91.5% to 48.0% on tool-call tasks.

ART: Automatic multi-step reasoning and tool-use for large language models

cs.CL · 2023-03-16 · unverdicted · novelty 6.0

ART automatically generates multi-step reasoning programs with tool integration for LLMs, yielding substantial gains over few-shot and auto-CoT prompting on BigBench and MMLU while matching hand-crafted CoT on most tasks.

citing papers explorer

The Constraint Tax: Measuring Validity-Correctness Tradeoffs in Structured Outputs for Small Language Models · cs.LG · 2026-05-20 · unverdicted · none · ref 6

ART: Automatic multi-step reasoning and tool-use for large language models · cs.CL · 2023-03-16 · unverdicted · none · ref 155
