Title resolution pending

Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, Dario Amodei · 2017

8 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Beyond Localization: A Comprehensive Diagnosis of Perspective-Conditioned Spatial Reasoning in MLLMs from Omnidirectional Images

cs.CV · 2026-05-12 · unverdicted · novelty 7.0 · 3 refs

MLLMs display a large perception-reasoning gap on perspective-conditioned spatial reasoning tasks from omnidirectional images, with sharp accuracy drops on advanced tasks like egocentric rotation, though partial gains are possible via RL reward shaping.

Stayin' Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data

cs.HC · 2026-05-05 · unverdicted · novelty 6.0

A methodological framework and a browser system, BITE, for collecting evolving user preferences on LLM outputs over time through context-triggered reflections and privacy-preserving behavioral data.

Vibrotactile Preference Learning: Uncertainty-Aware Preference Learning for Personalized Vibration Feedback

cs.HC · 2026-04-22 · unverdicted · novelty 6.0

VPL learns individualized vibrotactile preferences efficiently via uncertainty-aware Gaussian process models and active query selection in a 13-participant user study on an Xbox controller.

VC-Soup: Value-Consistency Guided Multi-Value Alignment for Large Language Models

cs.LG · 2026-03-18 · unverdicted · novelty 6.0

VC-Soup uses a cosine-similarity consistency metric to filter data, trains value-consistent policies, and applies linear merging with Pareto filtering to improve multi-value LLM alignment trade-offs.

GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant

cs.CL · 2026-03-01 · unverdicted · novelty 6.0

GroupGPT decouples intervention timing from response generation via edge-cloud collaboration for multi-user chats, scoring 4.72/5 on the new MUIR benchmark of 2500 segments while cutting token use by up to 3x and adding privacy sanitization.

Exploring a Gamified Personality Assessment Method through Interaction with LLM Agents Embodying Different Personalities

cs.HC · 2025-07-05 · unverdicted · novelty 6.0

A gamified system with multiple LLM agents of varied personalities gathers interaction data to produce more effective and interpretable Big Five personality assessments than single-context methods.

Reward-Aware Trajectory Shaping for Few-step Visual Generation

cs.CV · 2026-04-16 · unverdicted · novelty 5.0

RATS lets few-step visual generators surpass multi-step teachers by shaping trajectories with reward-based adaptive guidance instead of strict imitation.

SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

cs.AI · 2025-10-09 · unverdicted · novelty 5.0

SHE is a new RL framework using stepwise hybrid examination rewards to improve reasoning quality and accuracy in large-scale e-commerce query-product relevance prediction.

citing papers explorer

Showing 8 of 8 citing papers.

Beyond Localization: A Comprehensive Diagnosis of Perspective-Conditioned Spatial Reasoning in MLLMs from Omnidirectional Images
cs.CV · 2026-05-12 · unverdicted · none · ref 6 · 3 links

Stayin' Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data
cs.HC · 2026-05-05 · unverdicted · none · ref 6

Vibrotactile Preference Learning: Uncertainty-Aware Preference Learning for Personalized Vibration Feedback
cs.HC · 2026-04-22 · unverdicted · none · ref 10

VC-Soup: Value-Consistency Guided Multi-Value Alignment for Large Language Models
cs.LG · 2026-03-18 · unverdicted · none · ref 8

GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
cs.CL · 2026-03-01 · unverdicted · none · ref 14

Exploring a Gamified Personality Assessment Method through Interaction with LLM Agents Embodying Different Personalities
cs.HC · 2025-07-05 · unverdicted · none · ref 24

Reward-Aware Trajectory Shaping for Few-step Visual Generation
cs.CV · 2026-04-16 · unverdicted · none · ref 5

SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance
cs.AI · 2025-10-09 · unverdicted · none · ref 3
