arXiv preprint arXiv:1810.04586 , year=

The laplacian in rl: Learning representations with efficient approximations , author= · 2018 · cs.LG · arXiv 1810.04586

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

open full Pith review browse 2 citing papers arXiv PDF

abstract

The smallest eigenvectors of the graph Laplacian are well-known to provide a succinct representation of the geometry of a weighted graph. In reinforcement learning (RL), where the weighted graph may be interpreted as the state transition process induced by a behavior policy acting on the environment, approximating the eigenvectors of the Laplacian provides a promising approach to state representation learning. However, existing methods for performing this approximation are ill-suited in general RL settings for two main reasons: First, they are computationally expensive, often requiring operations on large matrices. Second, these methods lack adequate justification beyond simple, tabular, finite-state settings. In this paper, we present a fully general and scalable method for approximating the eigenvectors of the Laplacian in a model-free RL context. We systematically evaluate our approach and empirically show that it generalizes beyond the tabular, finite-state setting. Even in tabular, finite-state settings, its ability to approximate the eigenvectors outperforms previous proposals. Finally, we show the potential benefits of using a Laplacian representation learned using our method in goal-achieving RL tasks, providing evidence that our technique can be used to significantly improve the performance of an RL agent.

representative citing papers

Exploration and Online Transfer with Behavioral Foundation Models

cs.AI · 2026-06-29 · unverdicted · novelty 6.0

Frames online zero-shot transfer with BFMs as a bandit problem and derives an eigenvalue-minimization exploration strategy under linear reward approximation.

QHyer: Q-conditioned Hybrid Attention-mamba Transformer for Offline Goal-conditioned RL

cs.LG · 2026-05-03 · unverdicted · novelty 6.0

QHyer replaces return-to-go with a state-conditioned Q-estimator and adds a gated hybrid attention-mamba backbone to achieve state-of-the-art performance in offline goal-conditioned RL on both Markovian and non-Markovian datasets.

citing papers explorer

Showing 1 of 1 citing paper after filters.

QHyer: Q-conditioned Hybrid Attention-mamba Transformer for Offline Goal-conditioned RL cs.LG · 2026-05-03 · unverdicted · none · ref 214
QHyer replaces return-to-go with a state-conditioned Q-estimator and adds a gated hybrid attention-mamba backbone to achieve state-of-the-art performance in offline goal-conditioned RL on both Markovian and non-Markovian datasets.

arXiv preprint arXiv:1810.04586 , year=

fields

years

verdicts

representative citing papers

citing papers explorer