Title resolution pending

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields

cs.LG: 1 · cs.MA: 1

years

2026: 1 · 2024: 1

verdicts

UNVERDICTED 2

representative citing papers

Discovering Implicit Large Language Model Alignment Objectives

cs.LG · 2026-02-17 · unverdicted · novelty 6.0

Obj-Disco decomposes LLM alignment reward signals into sparse weighted combinations of interpretable natural language objectives via iterative analysis of behavioral changes across checkpoints, capturing over 90% of observed reward behavior.

citing papers explorer

  • Discovering Implicit Large Language Model Alignment Objectives cs.LG · 2026-02-17 · unverdicted · none · ref 5

  • The Hive Mind is a Single Reinforcement Learning Agent cs.MA · 2024-10-23 · unverdicted · none · ref 2

    A bee hive mind formed by weighted-voter imitation is equivalent to a single RL agent that follows a new multi-armed bandit rule, Maynard-Cross Learning.