Dual-objective reinforcement learning with novel Hamilton-Jacobi-Bellman formulations
2 Pith papers cite this work.
Verdicts: 2 unverdicted
Representative citing papers
Introduces two carbon-aware DRL-based intrusion detection systems for IoT edge gateways, reporting 94% accuracy for a supervised LSTM-DRL model and 98% for a label-free Autoencoder-DRL hybrid.
citing papers explorer
- Bellman Value Decomposition for Task Logic in Safe Optimal Control
  Bellman values for temporal logic tasks decompose into a graph of reach-avoid, avoid, and reach-avoid-loop equations solved by embedding the graph in a two-layer neural net (VDPPO) for safe high-dimensional control.
- Carbon-Aware Intrusion Detection: A Comparative Study of Supervised and Unsupervised DRL for Sustainable IoT Edge Gateways
  Introduces two carbon-aware DRL-based intrusion detection systems for IoT edge gateways, reporting 94% accuracy for a supervised LSTM-DRL model and 98% for a label-free Autoencoder-DRL hybrid.