response1_correct

Determine preference based on correctness:
- If only one response is correct, prefer the correct one
- If both responses are correct, prefer the one with better reasoning/explanati
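The rule above can be sketched as a small function. This is only an illustration under stated assumptions: the name `prefer`, the boolean correctness flags, and the numeric reasoning scores are hypothetical and do not appear in the source. The visible text is truncated and does not say what happens when neither response is correct, so the sketch returns `None` in that case rather than guessing.

```python
from typing import Optional


def prefer(correct_1: bool, correct_2: bool,
           reasoning_1: float = 0.0, reasoning_2: float = 0.0) -> Optional[int]:
    """Return 1 or 2 for the preferred response, or None when undecided.

    reasoning_1 / reasoning_2 are hypothetical quality scores (higher is
    better); the source does not define how reasoning quality is measured.
    """
    # Exactly one response is correct: prefer the correct one.
    if correct_1 != correct_2:
        return 1 if correct_1 else 2
    # Both responses are correct: prefer the one with better reasoning.
    if correct_1 and correct_2:
        if reasoning_1 == reasoning_2:
            return None  # tie: the visible text gives no tie-break
        return 1 if reasoning_1 > reasoning_2 else 2
    # Neither response is correct: not covered by the visible text.
    return None
```

For example, `prefer(True, False)` selects response 1 regardless of the reasoning scores, since correctness is checked first.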

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

CoAct: Co-Active LLM Preference Learning with Human-AI Synergy

cs.CL · 2026-04-19 · unverdicted · novelty 6.0

CoAct synergistically merges self-rewarding and active learning via self-consistency to select reliable AI labels and oracle-needed samples, delivering 8-13% gains on GSM8K, MATH, and WebInstruct.

citing papers explorer

CoAct: Co-Active LLM Preference Learning with Human-AI Synergy cs.CL · 2026-04-19 · unverdicted · none · ref 13