Adaptive Network Security Policies via Belief Aggregation and Rollout

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. The method uses a model or simulator of the system, which is updated when changes occur, and combines three components: belief estimation through particle filtering, offline policy computation through feature-based aggregation, and online policy adaptation through rollout. In particular, feature-based aggregation enables scalable offline optimization of a policy, while rollout adapts the policy online to changes in the system model without repeating the offline optimization. We analyze the approximation error of the aggregation and show that the rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.
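The abstract names two online components, belief estimation through particle filtering and policy adaptation through rollout. The following is a minimal sketch of how those two pieces can fit together, not the paper's implementation. Everything concrete in it is an assumption made for illustration: the single-host model, the constants `P_ATTACK` and `P_ALERT`, the cost structure, and the function names. The `base_policy` function is a placeholder for the policy that the paper computes offline through feature-based aggregation, which is not reproduced here.

```python
import random

# Hypothetical toy model (not from the paper): one host whose state is
# 0 = safe or 1 = compromised. Actions: 0 = wait, 1 = recover.
# Observations: 0 = no alert, 1 = alert.
P_ATTACK = 0.2               # assumed chance a safe host is compromised per step
P_ALERT = {0: 0.1, 1: 0.8}   # assumed alert probability given the state


def step(state, action, rng):
    """Sample the next state and stage cost from the assumed simulator."""
    cost = 2.0 * state + 1.0 * action  # compromise cost plus recovery cost
    if action == 1:
        state = 0                      # recovery restores the host
    if state == 0 and rng.random() < P_ATTACK:
        state = 1                      # a safe host may be attacked
    return state, cost


def obs_likelihood(obs, state):
    """Probability of the observation given the state."""
    p = P_ALERT[state]
    return p if obs == 1 else 1.0 - p


def update_belief(particles, action, obs, rng):
    """Particle filter step: propagate, weight by likelihood, resample."""
    moved = [step(s, action, rng)[0] for s in particles]
    weights = [obs_likelihood(obs, s) for s in moved]
    if sum(weights) == 0.0:
        return moved  # degenerate case: keep the propagated particles
    return rng.choices(moved, weights=weights, k=len(particles))


def base_policy():
    """Placeholder for the offline policy; here it always waits."""
    return 0


def rollout_action(particles, rng, horizon=10, gamma=0.95, samples=200):
    """One-step lookahead followed by base-policy simulation.

    Returns the action with the lowest estimated discounted cost.
    """
    best_action, best_value = None, float("inf")
    for action in (0, 1):
        total = 0.0
        for _ in range(samples):
            state = rng.choice(particles)          # sample a state from the belief
            state, cost = step(state, action, rng)
            ret, discount = cost, gamma
            for _ in range(horizon):               # follow the base policy
                state, cost = step(state, base_policy(), rng)
                ret += discount * cost
                discount *= gamma
            total += ret
        value = total / samples
        if value < best_value:
            best_action, best_value = action, value
    return best_action


rng = random.Random(0)
belief = [0] * 500 + [1] * 500
belief = update_belief(belief, action=0, obs=1, rng=rng)  # an alert arrives
action = rollout_action(belief, rng)
```

In this sketch, a change to the system model corresponds to editing `step` or the constants; `rollout_action` then uses the updated simulator immediately, while `base_policy` stays fixed. The base policy here ignores the state, which avoids nested belief updates inside the simulation; a belief-dependent base policy would need them.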

fields

eess.SY 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

On-Line Policy Iteration with Trajectory-Driven Policy Generation

eess.SY · 2026-04-16 · unverdicted · novelty 6.0 · 2 refs

An online policy iteration algorithm produces a sequence of monotonically cost-improving policies for fixed-initial-state deterministic control by training each new policy on the trajectory generated by the prior one.

citing papers explorer

  • On-Line Policy Iteration with Trajectory-Driven Policy Generation eess.SY · 2026-04-16 · unverdicted · none · ref 2 · 2 links · internal anchor

    An online policy iteration algorithm produces a sequence of monotonically cost-improving policies for fixed-initial-state deterministic control by training each new policy on the trajectory generated by the prior one.