Adams, Tyler Cody, and Peter A

A survey of inverse reinforcement learning , volume = · 2022 · DOI 10.1007/s10462-021-10108-x

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

open at publisher browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

support 1

representative citing papers

Scouting By Reward: VLM-TO-IRL-Driven Player Selection For Esports

cs.LG · 2026-04-15 · unverdicted · novelty 7.0

A multimodal VLM-TO-IRL framework with GAIL learns professional-specific reward functions from telemetry and tactical commentary to rank esports players by stylistic alignment.

Understanding Goal Generalisation in Sequential Reinforcement Learning

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

Empirical analysis of over 100 sequential RL training pipelines across 250+ OOD environments finds salient features drive generalization and early goals persist, with latent policy gradients simulating latent variable evolution to predict OOD behavior from training history.

Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

MAVIC corrects Bellman backups at instruction boundaries by adjusting the incoming objective and restoring continuation value, enabling consistent estimation under stochastic instruction switching in cooperative MARL.

AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions

cs.AI · 2024-08-23 · unverdicted · novelty 4.0

The paper introduces a taxonomy of AI safety for LLMs organized into Trustworthy AI, Responsible AI, and Safe AI perspectives, accompanied by a review of state-of-the-art methods, challenges, and future directions.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning cs.AI · 2026-05-12 · unverdicted · none · ref 105
MAVIC corrects Bellman backups at instruction boundaries by adjusting the incoming objective and restoring continuation value, enabling consistent estimation under stochastic instruction switching in cooperative MARL.
AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions cs.AI · 2024-08-23 · unverdicted · none · ref 5
The paper introduces a taxonomy of AI safety for LLMs organized into Trustworthy AI, Responsible AI, and Safe AI perspectives, accompanied by a review of state-of-the-art methods, challenges, and future directions.

Adams, Tyler Cody, and Peter A

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer