A Bayesian expert selection framework with variational Bayesian last layers and lower confidence bounds improves diffusion policies for active multi-target tracking.
InAdvances in Neural Information Pro- cessing Systems, volume 28, 2944–2952
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 4roles
background 3polarities
background 3representative citing papers
TOPPO reformulates PPO with critic balancing to address gradient ill-conditioning in multi-task RL and reports stronger mean and tail performance than SAC baselines on Meta-World+ using fewer parameters and steps.
FLAME is an MoE architecture using modality-specific routers and low-rank compression of expert knowledge to support efficient continual multimodal multi-task learning while reducing catastrophic forgetting.
PRISM-WM uses a context-aware MoE with latent orthogonalization to model hybrid dynamics and reduce rollout drift for model-based planning.
citing papers explorer
-
Diffusion Policy with Bayesian Expert Selection for Active Multi-Target Tracking
A Bayesian expert selection framework with variational Bayesian last layers and lower confidence bounds improves diffusion policies for active multi-target tracking.
-
TOPPO: Rethinking PPO for Multi-Task Reinforcement Learning with Critic Balancing
TOPPO reformulates PPO with critic balancing to address gradient ill-conditioning in multi-task RL and reports stronger mean and tail performance than SAC baselines on Meta-World+ using fewer parameters and steps.
-
FLAME: Adaptive Mixture-of-Experts for Continual Multimodal Multi-Task Learning
FLAME is an MoE architecture using modality-specific routers and low-rank compression of expert knowledge to support efficient continual multimodal multi-task learning while reducing catastrophic forgetting.
-
Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems
PRISM-WM uses a context-aware MoE with latent orthogonalization to model hybrid dynamics and reduce rollout drift for model-based planning.