Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- CODEBLOCK: Learning to Supervise Code at the Right Granularity
  CodeBlock partitions code responses into syntactically coherent blocks, scores them with generalized cross-entropy and data-flow signals, and applies sparse supervision, achieving higher pass@1 than full SFT on six benchmarks while using only 1.9% of tokens.
- MADS: Model-Aware Diverse Core Set Selection for Instruction Tuning
  MADS selects a 15% core set from the 52K Alpaca-GPT4 dataset via activations in Llama-3.2-3B-Instruct, yielding 2.5% average gains on 7B-13B models across six benchmarks versus full-data training.