Journal of Machine Learning Research

Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, Noah Dormann

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · baseline: 1

citation-polarity summary

background: 1 · baseline: 1

representative citing papers

Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.

Approximation-Free Differentiable Oblique Decision Trees

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

DTSemNet gives an exact, invertible neural-network encoding of hard oblique decision trees that supports direct gradient training for both classification and regression without probabilistic softening or quantized estimators.

Randomness is sometimes necessary for coordination

cs.AI · 2026-05-07 · conditional · novelty 7.0

Structured per-agent randomness via ranked masking in attention allows symmetric agents to break ties and coordinate, achieving perfect success on symmetric tasks where deterministic policies fail and enabling zero-shot transfer across team sizes.

A Multi-View Media Profiling Suite: Resources, Evaluation, and Analysis

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

Presents MBFC-2025 dataset and multi-view embeddings with fusion methods for media bias and factuality, reporting SOTA results on ACL-2020 and new benchmarks on MBFC-2025.

Diffusion Models Are Real-Time Game Engines

cs.LG · 2024-08-27 · conditional · novelty 7.0

A diffusion model trained on DOOM play sessions generates stable real-time interactive game frames at 20 FPS with quality near lossy JPEG.

Integrable Elasticity via Neural Demand Potentials

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

ICDN is a neural network that models log-demand from log-prices so elasticities can be derived exactly by differentiation, showing better out-of-sample performance than log-log benchmarks on beer sales data.

JAXenstein: Accelerated Benchmarking for First-Person Environments

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

JAXenstein ports the Wolfenstein 3D engine to JAX to create a fast, scalable benchmark for first-person visual RL that is several times quicker than existing vision-based alternatives.

SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

SAVGO unifies representation learning, value estimation, and policy optimization by embedding state-action pairs such that cosine similarity reflects action-value similarity, enabling similarity-kernel-guided policy improvement.

Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

Odysseus adapts PPO with a turn-level critic and leverages pretrained VLM action priors to train agents achieving at least 3x average game progress over frontier models in long-horizon Super Mario Land.

Distributional Off-Policy Evaluation with Deep Quantile Process Regression

stat.ML · 2026-04-20 · unverdicted · novelty 6.0

DQPOPE estimates the entire return distribution in off-policy evaluation via deep quantile process regression, providing statistical advantages over standard single-value methods with equivalent sample sizes.

Kernel-Based Safe Exploration in Deep Reinforcement Learning

eess.SY · 2026-05-21 · unverdicted · novelty 5.0

KBSE learns policies and barrier functions iteratively via conditional mean embeddings to bound unsafe state reachability probabilities during exploration in deep RL.

Learning Material-Aware Hamiltonian Risk Fields for Safe Navigation

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

A learned context-energy term in port-Hamiltonian policies creates selective risk navigation that activates evasive forces only when safer paths are available.

citing papers explorer

Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
cs.LG · 2026-05-14 · unverdicted · none · ref 86

Approximation-Free Differentiable Oblique Decision Trees
cs.LG · 2026-05-08 · unverdicted · none · ref 3

Randomness is sometimes necessary for coordination
cs.AI · 2026-05-07 · conditional · none · ref 7

A Multi-View Media Profiling Suite: Resources, Evaluation, and Analysis
cs.CL · 2026-05-02 · unverdicted · none · ref 71

Diffusion Models Are Real-Time Game Engines
cs.LG · 2024-08-27 · conditional · none · ref 44

Integrable Elasticity via Neural Demand Potentials
cs.LG · 2026-05-21 · unverdicted · none · ref 6

JAXenstein: Accelerated Benchmarking for First-Person Environments
cs.LG · 2026-05-19 · unverdicted · none · ref 21

SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control
cs.LG · 2026-05-01 · unverdicted · none · ref 26

Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning
cs.LG · 2026-05-01 · unverdicted · none · ref 108

Distributional Off-Policy Evaluation with Deep Quantile Process Regression
stat.ML · 2026-04-20 · unverdicted · none · ref 165

Kernel-Based Safe Exploration in Deep Reinforcement Learning
eess.SY · 2026-05-21 · unverdicted · none · ref 46

Learning Material-Aware Hamiltonian Risk Fields for Safe Navigation
cs.LG · 2026-05-07 · unverdicted · none · ref 44
