pith. sign in

arxiv: 2606.20880 · v1 · pith:OJ6MIRSOnew · submitted 2026-06-18 · 📊 stat.ML · cs.LG· stat.ME

Adversarial observations in probabilistic State-Space Models for robust Reinforcement Learning

Pith reviewed 2026-06-26 15:03 UTC · model grok-4.3

classification 📊 stat.ML cs.LGstat.ME
keywords adversarial attacksprobabilistic state-space modelsreinforcement learningrobust RLlatent statesobservation perturbationslikelihood constraintsrobotics safety
0
0 comments X

The pith

Adversarial observation shifts that remain model-consistent change latent states and policies in linear state-space RL models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes adversarial attacks on linear probabilistic state-space models used in reinforcement learning, where perturbations to observations must satisfy likelihood constraints to appear realistic. These shifts are shown to propagate to the estimated latent states and subsequently influence the agent's policy decisions. Understanding this mechanism matters for developing reinforcement learning agents that can handle sensor noise or deliberate attacks without failing. The analysis points toward methods for improving robustness in applications like robotics.

Core claim

Adversarial yet realistic observation shifts influence the latent state and influence policy decisions in linear probabilistic state-space models for reinforcement learning. This perspective provides a principled pathway toward building more robust reinforcement learning systems, with direct relevance to safety-critical domains such as robotics.

What carries the argument

Likelihood-constrained adversarial perturbations on observations in linear probabilistic state-space models, which affect latent state inference and policy decisions.

If this is right

  • Such attacks can mislead the inference of the environment's latent state.
  • Policy decisions become vulnerable to these consistent observation changes.
  • Robustness in RL can be improved by considering these adversarial effects.
  • Applications in robotics require accounting for sensor noise and attacks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Extending this analysis to nonlinear or deep state-space models could reveal broader vulnerabilities.
  • Empirical tests on physical robotic platforms would validate the influence on real policies.
  • Designing RL training procedures that simulate such attacks might yield inherently more robust policies.

Load-bearing premise

The attacker alters observations under likelihood constraints that ensure the perturbations remain consistent with the model.

What would settle it

Demonstrating that likelihood-constrained observation perturbations in a linear SSM for RL do not change the inferred latent state or the chosen policy would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.20880 by D. R\'ios Insua, M. Santos-Pascual.

Figure 1
Figure 1. Figure 1: State space model used in RL. Latent states [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Impact of the adversarial perturbation at time [PITH_FULL_IMAGE:figures/full_fig_p022_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Effect of the confidence level ϵ and of the attacked time step t on the adversarial perturbation and its impact on state estimation. stronger perturbations also tend to spread more visibly to neighboring state estimates. In addition, changing ϵ slightly alter its perturb direction o adv t − oˆt as Figure 3a illustrates. Con￾sequently, the direction of the most disruptive attack also depends on the likeliho… view at source ↗
Figure 4
Figure 4. Figure 4: Impact of adversarial perturbation at time [PITH_FULL_IMAGE:figures/full_fig_p024_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Estimated density of the most disruptive directions in a three-dimensional observation [PITH_FULL_IMAGE:figures/full_fig_p025_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Cumulative reward over 2000 episodes for the three evaluation settings: ( [PITH_FULL_IMAGE:figures/full_fig_p026_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Example trajectory from a single episode under the three considered scenarios. The [PITH_FULL_IMAGE:figures/full_fig_p027_7.png] view at source ↗
read the original abstract

Decision-making under partial or adversarial observability requires accurate inference of the environment's latent state and its associated uncertainty. This work analyses adversarial attacks on linear probabilistic state-space models, commonly integrated within reinforcement learning architectures, where the attacker alters observations under likelihood constraints that ensure the perturbations remains consistent. We analyze how such adversarial yet realistic observation shifts influence the latent state and influence policy decisions. This perspective provides a principled pathway toward building more robust reinforcement learning systems, with direct relevance to safety-critical domains such as robotics, where reliable operation under sensor noise, partial failures, and adversarial conditions is essential.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript claims that in linear probabilistic state-space models integrated with reinforcement learning, an attacker can alter observations subject to likelihood constraints (ensuring perturbations remain consistent with the model) and that these realistic adversarial shifts still influence the inferred latent state and downstream policy decisions. The work positions this analysis as a pathway to more robust RL systems, particularly for safety-critical domains.

Significance. The topic addresses an important intersection of adversarial robustness, state estimation, and RL. If the central claim were supported by explicit derivations through the filtering recursions and by reproducible experiments showing policy degradation under likelihood-constrained attacks, it would be relevant to safety-critical applications. However, the manuscript supplies no such derivations, experiments, or quantitative results, so the significance cannot be evaluated.

major comments (2)
  1. [Abstract] Abstract and entire manuscript: the central claim—that likelihood-constrained observation perturbations shift the latent state and alter policy decisions—is stated but never demonstrated. No filtering equations, attack formulation, or propagation analysis is provided, rendering the claim unsupported.
  2. No section, table, or equation supplies the linear SSM dynamics, the observation likelihood used for the constraint, the filtering update, or the policy mapping. Without these, it is impossible to verify whether the perturbations remain inside the model or produce the claimed downstream effect.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments correctly note that the submitted manuscript lacks explicit equations, derivations, and experimental demonstrations of the central claims. We will revise the paper to address these gaps by adding the required mathematical details and supporting analyses.

read point-by-point responses
  1. Referee: [Abstract] Abstract and entire manuscript: the central claim—that likelihood-constrained observation perturbations shift the latent state and alter policy decisions—is stated but never demonstrated. No filtering equations, attack formulation, or propagation analysis is provided, rendering the claim unsupported.

    Authors: We agree that the current version does not demonstrate the claim through explicit derivations. In the revised manuscript we will add a dedicated technical section deriving the linear probabilistic SSM, formulating the likelihood-constrained adversarial observation shift, showing the filtering recursion updates, and tracing the propagation to the posterior latent state and the downstream policy. This will make the central claim explicit and verifiable. revision: yes

  2. Referee: [—] No section, table, or equation supplies the linear SSM dynamics, the observation likelihood used for the constraint, the filtering update, or the policy mapping. Without these, it is impossible to verify whether the perturbations remain inside the model or produce the claimed downstream effect.

    Authors: This observation is accurate for the submitted draft. The revision will include the precise linear SSM state and observation equations, the form of the observation likelihood used to enforce the constraint, the filtering update equations, and the mapping from filtered latent states to policy actions. These additions will allow direct verification that the perturbations stay model-consistent and affect the policy. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The abstract and provided context describe an analysis that defines 'realistic' adversarial perturbations via explicit likelihood constraints and then traces their effects on latent states through standard filtering equations in linear probabilistic SSMs. No equations, derivations, self-citations, or fitted parameters are shown that reduce the central claim to its own inputs by construction. The setup is self-contained against the model dynamics without renaming known results or smuggling ansatzes. This matches the default expectation of no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no information on free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5627 in / 857 out tokens · 17689 ms · 2026-06-26T15:03:27.897198+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

46 extracted references · 14 canonical work pages · 1 internal anchor

  1. [1]

    BARRENO, M., NELSON, B., JOSEPH, A. D. and TYGAR, J. D. (2006). Can Machine Learning Be Secure? In Proceedings of the ACM Symposium on Information, Computer and Communications Security (ASIACCS)

  2. [2]

    and MUNIR, A

    BEHZADAN, V. and MUNIR, A. (2017). Vulnerability of Deep Reinforcement Learning to Policy Induction At- tacks.arXiv preprint

  3. [3]

    Learning long-term dependencies with gradient descent is difficult

    BENGIO, Y., SIMARD, P. and FRASCONI, P. (1994). Learning Long-Term Dependencies with Gradient Descent Is Difficult.IEEE Transactions on Neural Networks5157–166. https://doi.org/10.1109/72.279181

  4. [4]

    and ROLI, F

    BIGGIO, B. and ROLI, F. (2018). Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning. Pattern Recognition84317–331. https://doi.org/10.1016/j.patcog.2018.07.023

  5. [5]

    and VANDENBERGHE, L

    BOYD, S. and VANDENBERGHE, L. (2004).Convex Optimization. Cambridge University Press, Cambridge, UK

  6. [6]

    W., WILSON, D

    CAMERON, F., BEQUETTE, B. W., WILSON, D. M., BUCKINGHAM, B. A., LEE, H. and NIEMEYER, G. (2011). A Closed-Loop Artificial Pancreas Based on Risk Management.Journal of Diabetes Science and Technology 5368–379. https://doi.org/10.1177/193229681100500226

  7. [7]

    R., GOULD, N

    CONN, A. R., GOULD, N. I. M. and TOINT, P. L. (2000).Trust-Region Methods. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA

  8. [8]

    ABDOLMALEKI, A.,DELASCASAS, D. et al. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning.Nature602414–419. 28

  9. [9]

    and VYATKIN, V

    DENG, J., SIERLA, S., SUN, J. and VYATKIN, V. (2023). Offline Reinforcement Learning for Industrial Process Control: A Case Study from Steel Industry.Information Sciences632221–231. https://doi.org/10.1016/j.ins. 2023.03.019

  10. [10]

    and MURPHY, K

    KNOBLAUCH, J., JONES, M., BRIOL, F.-X. and MURPHY, K. P. (2024). Outlier-Robust Kalman Fil- tering through Generalised Bayes. InProceedings of the 41st International Conference on Machine Learning. Proceedings of Machine Learning Research23512138–12171

  11. [11]

    DURAN-MARTIN, G., SÁNCHEZ-BETANCOURT, L., SHESTOPALOFF, A. Y. and MURPHY, K. P. (2025). A Uni- fying Framework for Generalised Bayesian Online Learning in Non-Stationary Environments.Transactions on Machine Learning Research

  12. [12]

    and ZHENG, W

    FANG, C., QI, Y., CHEN, J., TAN, R. and ZHENG, W. X. (2020). Stealthy Actuator Signal Attacks in Stochas- tic Control Systems: Performance and Limitations.IEEE Transactions on Automatic Control653927–3934. https://doi.org/10.1109/TAC.2019.2950072

  13. [13]

    W., KOLLMAN, C., WOODALL, W

    RIA, L., SWANSON, V., LUM, J. W., KOLLMAN, C., WOODALL, W. and BECK, R. W. (2018). Predictive Low-Glucose Suspend Reduces Hypoglycemia in Adults, Adolescents, and Children With Type 1 Diabetes in an At-Home Randomized Crossover Study: Results of the PROLOG Trial.Diabetes Care412155–2161. https://doi.org/10.2337/dc18-0771 GARCÍA, J. and FERNÁNDEZ, F. (2015...

  14. [14]

    Gaudet, R

    GAUDET, B., LINARES, R. and FURFARO, R. (2020). Deep Reinforcement Learning for Six Degree-of-Freedom Planetary Landing.Advances in Space Research651723–1741. https://doi.org/10.1016/j.asr.2019.12.030

  15. [15]

    and RUSSELL, S

    GLEAVE, A., DENNIS, M., WILD, C., KANT, N., LEVINE, S. and RUSSELL, S. (2020). Adversarial Policies: Attacking Deep Reinforcement Learning. InInternational Conference on Learning Representations (ICLR)

  16. [16]

    J., SHLENS, J

    GOODFELLOW, I. J., SHLENS, J. and SZEGEDY, C. (2015). Explaining and Harnessing Adversarial Examples. InInternational Conference on Learning Representations (ICLR)

  17. [17]

    and DAO, T

    GU, A. and DAO, T. (2023). Mamba: Linear-Time Sequence Modeling with Selective State Spaces.arXiv preprint

  18. [18]

    and RÉ, C

    GU, A., GOEL, K. and RÉ, C. (2022). Efficiently Modeling Long Sequences with Structured State Spaces. In International Conference on Learning Representations (ICLR)

  19. [19]

    and NOROUZI, M

    HAFNER, D., LILLICRAP, T., BA, J. and NOROUZI, M. (2020). Dream to Control: Learning Behaviors by Latent Imagination. InInternational Conference on Learning Representations (ICLR)

  20. [20]

    and WEST, M

    HARRISON, J. and WEST, M. (1991). Dynamic Linear Model Diagnostics.Biometrika78797–808. https://doi. org/10.1093/biomet/78.4.797

  21. [21]

    HINTON, G. E. (2002). Training Products of Experts by Minimizing Contrastive Divergence.Neural Computation 141771–1800

  22. [22]

    and ABBEEL, P

    HUANG, S., PAPERNOT, N., GOODFELLOW, I., DUAN, Y. and ABBEEL, P. (2017). Adversarial Attacks on Neural Network Policies.arXiv preprint

  23. [23]

    P., LITTMAN, M

    KAELBLING, L. P., LITTMAN, M. L. and CASSANDRA, A. R. (1998). Planning and Acting in Partially Observ- able Stochastic Domains.Artificial Intelligence10199–134

  24. [24]

    and HASSIBI, B

    KARGIN, T., HAJAR, J., MALIK, V. and HASSIBI, B. (2024). Distributionally Robust Kalman Filtering over Finite and Infinite Horizon

  25. [25]

    and SCARAMUZZA, D

    KAUFMANN, E., BAUERSFELD, L., LOQUERCIO, A., MÜLLER, M., KOLTUN, V. and SCARAMUZZA, D. (2023). Champion-level drone racing using deep reinforcement learning.Nature620982–987

  26. [26]

    and LI, W

    KIOURTI, P., WARDEGA, K., JHA, S. and LI, W. (2020). TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning. InProceedings of the 57th ACM/IEEE Design Automation Conference1–6. IEEE

  27. [27]

    and CLOSAS, P

    LI, H., MEDINA, D., VILÀ-VALLS, J. and CLOSAS, P. (2021). Robust Variational-Based Kalman Filter for Outlier Rejection With Correlated Measurements.IEEE Transactions on Signal Processing69357–369. https: //doi.org/10.1109/TSP.2020.3042944

  28. [28]

    and BEHBAHANI, F

    LU, C., SCHROECKER, Y., GU, A., PARISOTTO, E., FOERSTER, J., SINGH, S. and BEHBAHANI, F. (2023). Structured State Space Models for In-Context Reinforcement Learning. InAdvances in Neural Information Processing Systems3647016–47031

  29. [29]

    E., BOTTERO, A

    LUIS, C. E., BOTTERO, A. G., VINOGRADSKA, J., BERKENKAMP, F. and PETERS, J. (2024). Uncertainty Representations in State-Space Layers for Deep Reinforcement Learning under Partial Observability.arXiv preprint

  30. [30]

    and VLADU, A

    MADRY, A., MAKELOV, A., SCHMIDT, L., TSIPRAS, D. and VLADU, A. (2018). Towards Deep Learning Mod- els Resistant to Adversarial Attacks. InInternational Conference on Learning Representations (ICLR)

  31. [31]

    MURPHY, K. P. (2023).Probabilistic Machine Learning: Advanced Topics. MIT Press. 29

  32. [32]

    and WRIGHT, S

    NOCEDAL, J. and WRIGHT, S. J. (2006).Numerical Optimization, 2 ed. Springer, New York, NY

  33. [33]

    and CAMPAGNOLI, P

    PETRIS, G., PETRONE, S. and CAMPAGNOLI, P. (2009).Dynamic Linear Models with R. Springer Science & Business Media

  34. [34]

    and SUKTHANKAR, R

    PINTO, L., DAVIDSON, J. and SUKTHANKAR, R. (2017). Robust Adversarial Reinforcement Learning. InInter- national Conference on Machine Learning (ICML). QUIÑONERO-CANDELA, J., SUGIYAMA, M., SCHWAIGHOFER, A. and LAWRENCE, N. D., eds. (2009).Dataset Shift in Machine Learning. MIT Press

  35. [35]

    and AMATO, C

    RATHBUN, E., OPREA, A. and AMATO, C. (2025). Adversarial Inception Backdoor Attacks against Reinforce- ment Learning. InProceedings of the 42nd International Conference on Machine Learning.Proceedings of Machine Learning Research26751273–51296. PMLR

  36. [36]

    E., TUNG, F

    RAUCH, H. E., TUNG, F. and STRIEBEL, C. T. (1965). Maximum likelihood estimates of linear dynamic sys- tems.AIAA Journal31445–1450. https://doi.org/10.2514/3.3166

  37. [37]

    Robust Bayesian Filtering and Smoothing Using Student's t Distribution

    ROTH, M., ARDESHIRI, T., ÖZKAN, E. and GUSTAFSSON, F. (2017). Robust Bayesian Filtering and Smoothing Using Student’s t Distribution. https://doi.org/10.48550/arXiv.1703.02428 SÄRKKÄ, S. and HARTIKAINEN, J. (2013). Variational Bayesian Adaptation of Noise Covariances in Non-Linear Kalman Filtering. https://doi.org/10.48550/arXiv.1302.0681 SÄRKKÄ, S. and N...

  38. [38]

    A., KUHN, D

    SHAFIEEZADEH-ABADEH, S., NGUYEN, V. A., KUHN, D. and MOHAJERINESFAHANI, P. (2018). Wasserstein Distributionally Robust Kalman Filtering. InAdvances in Neural Information Processing Systems31

  39. [39]

    (2005).Stochastic Volatility: Selected Readings

    SHEPHARD, N., ed. (2005).Stochastic Volatility: Selected Readings. Oxford University Press

  40. [40]

    TABUADA, P. (2017). Secure State Estimation for Cyber-Physical Systems under Sensor Attacks: A Satis- fiability Modulo Theory Approach.IEEE Transactions on Automatic Control624917–4932. https://doi.org/ 10.1109/TAC.2017.2650223

  41. [41]

    and LINDERMAN, S

    SMITH, J., WARRINGTON, A. and LINDERMAN, S. W. (2023). Simplified State Space Layers for Sequence Modeling. InInternational Conference on Learning Representations (ICLR)

  42. [42]

    M., MIMI, M

    SOMVANSHI, S., ISLAM, M. M., MIMI, M. S., POLOCK, S. B. B., CHHETRI, G., DUTTA, A., RAFE, A. and DAS, S. (2025). Advancing Intelligent Sequence Modeling: Evolution, Trade-offs, and Applications of State-Space Architectures from S4 to Mamba.arXiv preprint arXiv:2503.18970

  43. [43]

    VASSILEV, A., OPREA, A. et al. (2024). Adversarial Machine Learning: A Taxonomy and Terminology of At- tacks and Mitigations NIST AI Report No. NIST.AI.100-2e2023, National Institute of Standards and Technol- ogy (NIST). https://doi.org/10.6028/NIST.AI.100-2e2023

  44. [44]

    POLOSUKHIN, I. (2017). Attention Is All You Need. InAdvances in Neural Information Processing Systems (NeurIPS)

  45. [45]

    and WANG, H

    WANG, H., LI, H., FANG, J. and WANG, H. (2018). Robust Gaussian Kalman Filter With Outlier Detection. IEEE Signal Processing Letters251236–1240. https://doi.org/10.1109/LSP.2018.2851156

  46. [46]

    and HARRISON, J

    WEST, M. and HARRISON, J. (1997).Bayesian Forecasting and Dynamic Models. Springer Science & Business Media