pith. sign in

Weight ensembling improves reasoning in language models

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

fields

cs.LG 2 cs.CL 1

years

2026 3

verdicts

UNVERDICTED 3

roles

background 1

polarities

background 1

representative citing papers

AIPO: Learning to Reason from Active Interaction

cs.CL · 2026-05-08 · unverdicted · novelty 6.0 · 2 refs

AIPO adds active multi-agent consultation (Verify, Knowledge, Reasoning agents) plus custom importance sampling to RLVR training so LLMs expand their reasoning boundary and then operate without the agents.

citing papers explorer

Showing 3 of 3 citing papers.