Presents a data-driven value iteration algorithm for output-feedback LQR that recovers the optimal state-feedback gain via a non-minimal realization constructed from Kreisselmeier's adaptive filter.
5 Pith papers cite this work.
verdicts
UNVERDICTED 5
representative citing papers
RS-HyRe-R1 combines three rewards in RL training to overcome perceptual inertia in remote sensing VLMs, achieving SOTA results on REC, OVD, and VQA with a 3B-parameter model that outperforms larger ones.
PRISM-Coach introduces four-view data separation, vault-based identity control, and a privacy-constrained contextual bandit for adaptive peer grouping, reporting higher adherence and weight loss in a 2,800-user commercial deployment.
Safe RL by restricting policies to forward-invariant stabilizing actions, demonstrated on quadcopter hover control.
Lark is a biologically inspired neuroevolution framework for multi-stakeholder LLM agents that iteratively generates, refines, and selects strategies using plasticity, duplication/maturation, influence-weighted Borda scoring, and token penalties, achieving top-3 performance in 80% of 30-round trials.
citing papers explorer
- Data-Driven Linear Quadratic Control Using Output-Feedback via Non-Minimal Realization
  Presents a data-driven value iteration algorithm for output-feedback LQR that recovers the optimal state-feedback gain via a non-minimal realization constructed from Kreisselmeier's adaptive filter.
- RS-HyRe-R1: A Hybrid Reward Mechanism to Overcome Perceptual Inertia for Remote Sensing Images Understanding
  RS-HyRe-R1 combines three rewards in RL training to overcome perceptual inertia in remote sensing VLMs, achieving SOTA results on REC, OVD, and VQA with a 3B-parameter model that outperforms larger ones.
- Privacy-by-Design Adaptive Group Assignment for Digital Lifestyle Coaching at Scale
  PRISM-Coach introduces four-view data separation, vault-based identity control, and a privacy-constrained contextual bandit for adaptive peer grouping, reporting higher adherence and weight loss in a 2,800-user commercial deployment.
- Learning over Forward-Invariant Policy Classes: Reinforcement Learning without Safety Concerns
  Safe RL by restricting policies to forward-invariant stabilizing actions, demonstrated on quadcopter hover control.
- Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents
  Lark is a biologically inspired neuroevolution framework for multi-stakeholder LLM agents that iteratively generates, refines, and selects strategies using plasticity, duplication/maturation, influence-weighted Borda scoring, and token penalties, achieving top-3 performance in 80% of 30-round trials.