SVSP applies linear SVM splits to partition state-action data and distill black-box RL policies into fewer interpretable subpolicies, reporting +7.4% higher mean return than Voronoi State Partitioning and +2.8% over the original TD3 policy.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Hierarchical Support Vector State Partitioning for Distilling Black Box Reinforcement Learning Policies