arXiv preprint, abs/2501.01821

SDPO: Segment-Level Direct Preference Optimization for Social Agents · arXiv 2501.01821

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution

cs.AI · 2026-04-21 · unverdicted · novelty 6.0

SAVOIR combines prospective expected-utility valuation with Shapley values for fair credit assignment in social dialogue RL. It achieves state-of-the-art results on SOTOPIA, where a 7B model matches or exceeds GPT-4o and Claude-3.5-Sonnet.

citing papers explorer

Showing 1 of 1 citing paper.

SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution

cs.AI · 2026-04-21 · unverdicted · none · ref 6
