Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models

11 Pith papers cite this work. Polarity classification is still indexing.

years: 2026 (11)

verdicts: unverdicted (11)

representative citing papers

Misaligned by Reward: Socially Undesirable Preferences in LLMs

cs.CL · 2026-05-06 · unverdicted · novelty 6.0

Reward models for LLMs frequently select socially undesirable options across four social domains, show no overall best performer, and exhibit a bias-avoidance versus context-sensitivity trade-off.

citing papers explorer

Showing 2 of 2 citing papers after filters.