Deep Reinforcement Learning for Spacecraft Attitude Control During Atmospheric Re-Entry

Alexander Fabisch; Edoardo Caroselli; Julian Theis; Mariela De Lucas Álvarez; Melvin Laux

arxiv: 2606.31291 · v1 · pith:JCGCA3FBnew · submitted 2026-06-30 · 💻 cs.LG

Pith reviewed 2026-07-01 06:12 UTC · model grok-4.3

classification 💻 cs.LG

keywords reinforcement learning · spacecraft attitude control · atmospheric re-entry · hybrid control · dynamics randomization · PID controller · robustness to parameter variation

The pith

Hybrid controllers that combine reinforcement learning with a PID baseline track the angle of attack more accurately than the baseline alone during spacecraft re-entry and remain robust when mass, inertia tensor, or flap actuator bandwidth vary within the operational envelope.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests deep reinforcement learning for attitude control during atmospheric re-entry and compares it to an industry-standard gain-scheduled PID controller. Pure RL reaches similar performance inside the training distribution but fails to generalize outside it. Adding dynamics randomization during training and combining the RL policy with the PID controller produces hybrid systems that, inside a chosen operational envelope, follow the required angle of attack more closely than the baseline and keep performance when mass, inertia tensor, and actuator bandwidth vary.

Core claim

State-of-the-art continuous off-policy reinforcement learning matches the performance of a gain-scheduled PID controller on the re-entry attitude task, yet its out-of-distribution generalization remains insufficient. Dynamics randomization during training, together with a hybrid RL-plus-PID architecture, yields controllers that track the angle of attack better and exhibit greater robustness to variations in mass, inertia tensor, and flap actuator bandwidth inside the predefined operational envelope.
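
Dynamics randomization, as named in the claim above, can be pictured as resampling the plant parameters at every episode reset. The following is a minimal sketch only: the nominal values, the ±20 % ranges, and the names `NOMINAL`, `ENVELOPE`, and `sample_dynamics` are assumptions made for illustration and are not taken from the paper.

```python
import random

# Hypothetical nominal parameters; the paper's actual values are not stated here.
NOMINAL = {
    "mass_kg": 1000.0,
    "inertia_diag_kgm2": (500.0, 900.0, 1000.0),
    "flap_bandwidth_hz": 10.0,
}

# Assumed relative half-widths of the operational envelope (here +/- 20 %).
ENVELOPE = {"mass_kg": 0.2, "inertia_diag_kgm2": 0.2, "flap_bandwidth_hz": 0.2}


def scale(rng, rel):
    """Draw a multiplicative factor uniformly from [1 - rel, 1 + rel]."""
    return rng.uniform(1.0 - rel, 1.0 + rel)


def sample_dynamics(rng):
    """Sample one set of plant parameters inside the assumed envelope."""
    params = {}
    for key, nominal in NOMINAL.items():
        rel = ENVELOPE[key]
        if isinstance(nominal, tuple):
            # Each principal moment of inertia is perturbed independently.
            params[key] = tuple(v * scale(rng, rel) for v in nominal)
        else:
            params[key] = nominal * scale(rng, rel)
    return params


rng = random.Random(0)
# One fresh context per training episode.
contexts = [sample_dynamics(rng) for _ in range(5)]
```

In a training loop, each sampled context would be passed to the simulator before the episode starts, so the policy never sees the same plant twice in a row.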

What carries the argument

Hybrid RL-PID controller trained with dynamics randomization to enforce generalization inside a fixed operational envelope.
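
One common way to build such a hybrid is an additive (residual) scheme, where the learned policy adds a bounded correction to the PID command. The sketch below illustrates that pattern under stated assumptions: the function name, the normalization to `[-u_max, u_max]`, and the `residual_scale` value are hypothetical, and the paper's exact formulation may differ.

```python
def clip(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def additive_hybrid_command(u_pid, a_rl, residual_scale=0.3, u_max=1.0):
    """Combine a PID command with a bounded RL correction.

    u_pid: baseline command, assumed normalized to [-u_max, u_max]
    a_rl: policy output, assumed to lie in [-1, 1]
    residual_scale: assumed share of actuator authority given to the policy
    """
    a_rl = clip(a_rl, -1.0, 1.0)
    return clip(u_pid + residual_scale * u_max * a_rl, -u_max, u_max)


# With zero policy output the hybrid reduces to the PID baseline.
baseline_only = additive_hybrid_command(u_pid=0.4, a_rl=0.0)
# A full negative correction lowers the command by residual_scale * u_max.
corrected = additive_hybrid_command(u_pid=0.4, a_rl=-1.0)
# The sum is clipped at the actuator limit.
saturated = additive_hybrid_command(u_pid=0.9, a_rl=1.0)
```

Bounding the correction keeps the PID controller in charge of the bulk of the command, which is one reason residual schemes tend to degrade gracefully when the policy is poorly trained.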

If this is right

Hybrid RL-PID controllers achieve tighter angle-of-attack tracking than pure PID inside the operational envelope.
The same controllers maintain performance when spacecraft mass or inertia tensor changes within the tested range.
They remain effective when flap actuator bandwidth varies inside the envelope.
Dynamics randomization during training is what enables the observed generalization improvement over plain RL.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same training recipe could be applied to other attitude-control problems that involve nonlinear aerodynamics and parameter uncertainty.
If the operational envelope proves too narrow for real missions, the method would require either larger randomization ranges or online adaptation after deployment.
Replacing the fixed PID baseline with an adaptive classical controller might further improve the hybrid result without extra simulation cost.

Load-bearing premise

The simulation environment and the predefined operational envelope sufficiently capture the relevant real-world uncertainties and failure modes that would be encountered during actual atmospheric re-entry.

What would settle it

A higher-fidelity simulation or flight test in which the hybrid controller loses angle-of-attack tracking or becomes unstable under a mass or inertia change that lies inside the claimed operational envelope would falsify the robustness claim.

Figures

Figures reproduced from arXiv: 2606.31291 by Alexander Fabisch, Edoardo Caroselli, Julian Theis, Mariela De Lucas Álvarez, Melvin Laux.

**Figure 2.** MR.Q surpasses the baseline controller. Learning curves with interquartile means (IQM) …

**Figure 3.** Comparison under nominal conditions. With the learning curves in …

**Figure 4.** Comparisons of control architectures. With MR.Q, we compare control architectures (see …

**Figure 5.** Generalization of policies trained under nominal conditions. The solid lines show median …

**Figure 6.** Learning curves of three training runs with DR and pure RL with MR.Q. Lines indicate median performance over 30 test contexts and shaded areas show the full range of returns. Task scheduling methods do not perform better than dynamics randomization or round robin. Since the results are inconclusive, we report them in Appendix L. Among task selection strategies, focusing on hard tasks in the beginning (SM…

**Figure 7.** Sensitivity analysis for weights in the reward function.

**Figure 8.** Comparison of commanded and measured aerodynamic angles for pure RL controller …

**Figure 9.** Comparison of commanded and measured aerodynamic angles for pure RL controller …

**Figure 10.** Control commands for pure RL controller based on MR.Q trained under nominal condi…

**Figure 11.** Distributions of aerodynamic angle errors. The results are obtained for the best policy …

**Figure 12.** Distributions of control commands. The results are obtained for the best policy and 10 …

**Figure 13.** Distribution of aerodynamic angle errors evaluated over 100 test contexts. The overlap …

**Figure 14.** Individual learning curves of MR.Q (Only RL).

**Figure 15.** Individual learning curves of TD7 (Only RL).

**Figure 16.** Individual learning curves of TD3 (Only RL).

**Figure 17.** Individual learning curves of SAC (Only RL).

**Figure 18.** Individual learning curves of MR.Q (Additive Hybrid).

**Figure 19.** Individual learning curves of TD7 (Additive Hybrid).

**Figure 20.** Individual learning curves of TD3 (Additive Hybrid).

**Figure 21.** Individual learning curves of SAC (Additive Hybrid).

**Figure 22.** Individual learning curves of MR.Q (Gain-Scheduling Hybrid).
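
The learning curves in Figure 2 are summarized with interquartile means (IQM). As a reminder of what that statistic does, here is a small sketch that treats IQM as a 25 % trimmed mean; the function name and the return values are made up for illustration and are not results from the paper.

```python
def interquartile_mean(scores):
    """Mean of the middle 50 % of scores (25 % trimmed from each end)."""
    ordered = sorted(scores)
    cut = int(0.25 * len(ordered))  # number of values dropped per tail
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)


# Hypothetical final returns from eight training runs; one run collapsed.
returns = [-950.0, -120.0, -110.0, -105.0, -100.0, -98.0, -95.0, -90.0]
iqm = interquartile_mean(returns)
mean = sum(returns) / len(returns)
```

In this made-up example the single collapsed run pulls the plain mean far below the typical run, while the IQM stays close to the bulk of the returns.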

Original abstract

Deep reinforcement learning has the potential to solve attitude control problems more adaptively, precisely, and robustly by handling nonlinear dynamics, uncertainties, and failure cases more effectively than traditional attitude control approaches. We explore reinforcement learning (RL) for attitude control in spacecraft re-entry. An industry-standard proportional-integral-derivative controller with gain scheduling serves as a strong baseline for model-free RL and hybrid controllers that combine these two approaches. We formalize the application in the RL framework to apply continuous, off-policy RL. State-of-the-art RL achieves comparable performance to traditional control approaches in this domain. However, its out-of-distribution generalization is not sufficient. Hence, we use dynamics randomization to introduce challenging task variations during training and enforce generalization in a predefined operational envelope. Finally, we assess the best obtained RL-based controllers with application-specific metrics to show superior performance in comparison to traditional controllers in the operational envelope, that is, hybrid controllers are able to track the angle of attack better and are more robust under variations of mass, inertia tensor, and flap actuator bandwidth.
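
The baseline named in the abstract, a PID controller with gain scheduling, can be sketched for a single axis as a PID law whose gains are interpolated over a scheduling variable. Everything specific below is an assumption for illustration: the choice of Mach number as scheduling variable, the breakpoints and gain values in `SCHEDULE`, and the class and function names are hypothetical and do not come from the paper.

```python
from bisect import bisect_right

# Hypothetical schedule: (Mach number, (kp, ki, kd)).
SCHEDULE = [
    (2.0, (1.0, 0.10, 0.30)),
    (10.0, (2.0, 0.20, 0.50)),
    (25.0, (4.0, 0.40, 0.90)),
]


def scheduled_gains(mach):
    """Linearly interpolate gains, holding the end values outside the table."""
    points = [m for m, _ in SCHEDULE]
    if mach <= points[0]:
        return SCHEDULE[0][1]
    if mach >= points[-1]:
        return SCHEDULE[-1][1]
    i = bisect_right(points, mach)
    (m0, g0), (m1, g1) = SCHEDULE[i - 1], SCHEDULE[i]
    t = (mach - m0) / (m1 - m0)
    return tuple(a + t * (b - a) for a, b in zip(g0, g1))


class GainScheduledPID:
    """Single-axis PID on the angle-of-attack tracking error."""

    def __init__(self):
        self.integral = 0.0
        self.prev_error = None

    def step(self, alpha_cmd, alpha_meas, mach, dt):
        """Return a command for the current tracking error and Mach number."""
        kp, ki, kd = scheduled_gains(mach)
        error = alpha_cmd - alpha_meas
        self.integral += error * dt
        # No derivative term on the first call, when no previous error exists.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative


pid = GainScheduledPID()
u0 = pid.step(alpha_cmd=0.5, alpha_meas=0.4, mach=6.0, dt=0.01)
```

A flight-grade controller would add anti-windup, derivative filtering, and actuator limits; they are left out here to keep the scheduling idea visible.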

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Hybrid RL-PID beats the gain-scheduled baseline on tracking and robustness inside the randomized sim envelope, but the result stays provisional without numbers or envelope validation.

The paper's main point is that hybrid controllers mixing RL with PID track angle of attack better and handle variations in mass, inertia, and flap bandwidth more robustly than a standard gain-scheduled PID, but only after dynamics randomization inside a predefined operational envelope.

The work takes established off-policy continuous RL, formalizes the re-entry attitude task for it, and adds randomization during training because pure RL did not generalize well. Comparing against an industry PID baseline on application-specific metrics is a sensible choice and gives the results some practical relevance.

The soft spot is the simulation envelope itself. The abstract mentions no check against flight data, higher-fidelity models, or expert-identified failure modes, so the robustness numbers rest on whether the chosen randomization ranges actually cover the uncertainties that matter in real re-entry. If density fluctuations or sensor effects fall outside those ranges, the reported advantage does not transfer. The abstract also gives no quantitative results or training details, which leaves the size of the improvement unclear.

This is incremental applied work aimed at aerospace control engineers or RL groups working on physical systems. It is not foundational, but the concrete domain and strong baseline make it worth checking.

I would send it to peer review. The application is real and the hybrid idea is worth testing with the full numbers and any sim validation the authors can supply.

Referee Report

2 major / 1 minor

Summary. The paper investigates the application of deep reinforcement learning (RL) to spacecraft attitude control during atmospheric re-entry. It compares model-free RL and hybrid controllers that combine RL with PID against a traditional gain-scheduled PID baseline. By employing dynamics randomization during training to enforce generalization within a predefined operational envelope, the authors claim that hybrid controllers outperform traditional approaches in tracking the angle of attack and in robustness to variations in mass, inertia tensor, and flap actuator bandwidth.

Significance. If the empirical results hold under proper validation of the simulation envelope, this work could demonstrate the viability of hybrid RL-PID controllers for handling nonlinear dynamics and uncertainties in re-entry scenarios, potentially improving adaptability over classical methods. The use of dynamics randomization is a positive step toward better generalization.

major comments (2)

Abstract: The abstract asserts superior performance on application-specific metrics and robustness but provides no quantitative results, error bars, training details, or specific metrics, making the central claim impossible to evaluate from the given text.
Abstract: The robustness claims depend on the simulation environment and predefined operational envelope capturing real-world uncertainties; however, no validation against flight data, higher-fidelity models, or domain-expert failure scenarios is indicated, which is load-bearing for the transferability of the hybrid superiority result.

minor comments (1)

The abstract could benefit from including at least one key quantitative result to support the claims of superior performance.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments on the abstract below and indicate planned revisions.

Referee: Abstract: The abstract asserts superior performance on application-specific metrics and robustness but provides no quantitative results, error bars, training details, or specific metrics, making the central claim impossible to evaluate from the given text.

Authors: We agree that the abstract would be strengthened by quantitative support. The revised version will incorporate key results including mean angle-of-attack tracking error, robustness metrics under mass/inertia/bandwidth variations, and a brief reference to dynamics randomization during training, with error bars where space permits. revision: yes

Referee: Abstract: The robustness claims depend on the simulation environment and predefined operational envelope capturing real-world uncertainties; however, no validation against flight data, higher-fidelity models, or domain-expert failure scenarios is indicated, which is load-bearing for the transferability of the hybrid superiority result.

Authors: The work is a simulation study that defines an operational envelope and uses dynamics randomization to promote generalization within it; no flight-data or higher-fidelity validation is performed. We will revise the abstract and add a limitations paragraph that explicitly states the simulation scope and the assumptions underlying the envelope, thereby clarifying the transferability boundary without overstating the results. revision: partial

Circularity Check

0 steps flagged

No circularity; empirical simulation comparison against external baseline

The paper reports an empirical evaluation of RL, hybrid RL+PID, and PID controllers on angle-of-attack tracking and robustness metrics inside a predefined simulation envelope that includes dynamics randomization during training. All performance numbers are obtained by direct rollout comparison to an industry-standard PID baseline; no equations, fitted parameters, or self-citations are invoked to derive the superiority claim. The operational envelope is an explicit experimental design choice rather than a self-referential definition, and the central results remain falsifiable by external simulation or flight data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; simulation fidelity and operational envelope are implicit modeling choices but not detailed.


Reference graph

Works this paper leans on

161 extracted references · 64 canonical work pages · 10 internal anchors

[1]

International Journal of Robotics Research , volume =

OpenAI: Marcin Andrychowicz and Bowen Baker and Maciek Chociej and Rafal Józefowicz and Bob McGrew and Jakub Pachocki and Arthur Petron and Matthias Plappert and Glenn Powell and Alex Ray and Jonas Schneider and Szymon Sidor and Josh Tobin and Peter Welinder and Lilian Weng and Wojciech Zaremba , title =. International Journal of Robotics Research , volum...

2020
[2]

CoRR , volume =

Rika Antonova and Silvia Cruciani and Christian Smith and Danica Kragic , title =. CoRR , volume =. 2017 , url =

2017
[3]

Robotics: Science and Systems , YEAR =

Jie Tan AND Tingnan Zhang AND Erwin Coumans AND Atil Iscen AND Yunfei Bai AND Danijar Hafner AND Steven Bohez AND Vincent Vanhoucke , TITLE =. Robotics: Science and Systems , YEAR =
[4]

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , year=

Peng, Xue Bin and Andrychowicz, Marcin and Zaremba, Wojciech and Abbeel, Pieter , booktitle=. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , year=
[5]

Root Mean Square Layer Normalization , url =

Zhang, Biao and Sennrich, Rico , booktitle =. Root Mean Square Layer Normalization , url =
[6]

2019 , url =

Johannink, Tobias and Bahl, Shikhar and Nair, Ashvin and Luo, Jianlan and Kumar, Avinash and Loskyll, Matthias and Ojea, Juan Aparicio and Solowjow, Eugen and Levine, Sergey , title =. 2019 , url =. doi:10.1109/ICRA.2019.8794127 , booktitle =

work page doi:10.1109/icra.2019.8794127 2019
[7]

International Conference on Learning Representations , year=

High-Dimensional Continuous Control Using Generalized Advantage Estimation , author=. International Conference on Learning Representations , year=
[8]

2018 , number =

Joan Solà and Jérémie Deray and Dinesh Atchuthan , title =. 2018 , number =

2018
[9]

Journal of Machine Learning Research , year =

Zintgraf, Luisa and Schulze, Sebastian and Lu, Cong and Feng, Leo and Igl, Maximilian and Shiarlis, Kyriacos and Gal, Yarin and Hofmann, Katja and Whiteson, Shimon , title =. Journal of Machine Learning Research , year =
[10]

International Conference on Machine Learning , pages =

Hard Tasks First: Multi-Task Reinforcement Learning Through Task Scheduling , author =. International Conference on Machine Learning , pages =. 2024 , volume =

2024
[11]

Journal of Machine Learning Research , year =

Alexander Fabisch and Jan Hendrik Metzen , title =. Journal of Machine Learning Research , year =
[12]

International Conference on Learning Representations , year=

Learning to Multi-Task by Active Sampling , author=. International Conference on Learning Representations , year=
[13]

Discounted

Kocsis, Levente and Szepesvári, Csaba , booktitle =. Discounted
[14]

International Conference on Algorithmic Learning Theory , year = 2011, pages =

On Upper-Confidence Bound Policies for Switching Bandit Problems , author =. International Conference on Algorithmic Learning Theory , year = 2011, pages =

2011
[15]

Contextualize Me

Carolin Benjamins and Theresa Eimer and Frederik Schubert and Aditya Mohan and Sebastian D. Contextualize Me. EWRL , year=
[16]

Contextual Markov Decision Processes

Assaf Hallak and Dotan Di Castro and Shie Mannor , year=. Contextual. CoRR , volume=. 1502.02259 , archivePrefix=

work page internal anchor Pith review Pith/arXiv arXiv
[17]

Nature , pages =

Hafner, Danijar and Pasukonis, Jurgis and Ba, Jimmy and Lillicrap, Timothy , title =. Nature , pages =. doi:10.1038/s41586-025-08744-2 , year =

work page doi:10.1038/s41586-025-08744-2
[18]

2024 , url=

Nicklas Hansen and Hao Su and Xiaolong Wang , booktitle=. 2024 , url=

2024
[19]

Deep Reinforcement Learning at the Edge of the Statistical Precipice , url =

Agarwal, Rishabh and Schwarzer, Max and Castro, Pablo Samuel and Courville, Aaron C and Bellemare, Marc , booktitle =. Deep Reinforcement Learning at the Edge of the Statistical Precipice , url =
[20]

International Conference on Machine Learning , year=

Hyperspherical Normalization for Scalable Deep Reinforcement Learning , author=. International Conference on Machine Learning , year=
[21]

Bigger, Regularized, Optimistic: scaling for compute and sample efficient continuous control , url =

Nauman, Michal and Ostaszewski, Mateusz and Jankowski, Krzysztof and Mi o\'. Bigger, Regularized, Optimistic: scaling for compute and sample efficient continuous control , url =. Advances in Neural Information Processing Systems , pages =
[22]

International Conference on Learning Representations , year=

CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity , author=. International Conference on Learning Representations , year=
[23]

Takuya Hiraoka and Takahisa Imagawa and Taisei Hashimoto and Takashi Onishi and Yoshimasa Tsuruoka , booktitle=. Dropout. 2022 , url=

2022
[24]

International Conference on Learning Representations , year=

Towards General-Purpose Model-Free Reinforcement Learning , author=. International Conference on Learning Representations , year=
[25]

For SALE: State-Action Representation Learning for Deep Reinforcement Learning , url =

Fujimoto, Scott and Chang, Wei-Di and Smith, Edward and Gu, Shixiang (Shane) and Precup, Doina and Meger, David , booktitle =. For SALE: State-Action Representation Learning for Deep Reinforcement Learning , url =
[26]

An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay , url =

Fujimoto, Scott and Meger, David and Precup, Doina , booktitle =. An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay , url =
[27]

International Conference on Machine Learning , pages =

Addressing Function Approximation Error in Actor-Critic Methods , author =. International Conference on Machine Learning , pages =. 2018 , volume =

2018
[28]

Towers, Mark and Kwiatkowski, Ariel and Terry, Jordan and Balis, John U. and Cola, Gianluca De and Deleu, Tristan and Goulão, Manuel and Kallinteris, Andreas and Krimmel, Markus and KG, Arjun and Perez-Vicente, Rodrigo and Pierré, Andrea and Schulhoff, Sander and Tai, Jun Jet and Tan, Hannah and Younis, Omar G. , year =. Gymnasium: A Standard Interface fo...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2407.17032
[29]

Towers, Mark and Kwiatkowski, Ariel and Terry, Jordan K and Balis, John U. and de Cola, Gianluca and Deleu, Tristan and Goulão, Manuel and Kallinteris, Andreas and Krimmel, Markus and KG, Arjun and Perez-Vicente, Rodrigo and Pierré, Andrea and Schulhoff, Sander and Tai, Jun Jet and Tan, Hannah Jin Shen and Younis, Omar G. , month =. Gymnasium: A Standard ...
[30]

and Calandra, Roberto , month =

Pineda, Luis and Amos, Brandon and Zhang, Amy and Lambert, Nathan O. and Calandra, Roberto , month =. MBRL-Lib: A Modular Library for Model-based Reinforcement Learning , shorttitle =. 2021 , note =. doi:10.48550/arXiv.2104.10159 , abstract =

work page doi:10.48550/arxiv.2104.10159 2021
[31]

RLlib: Abstractions for Distributed Reinforcement Learning , shorttitle =

Liang, Eric and Liaw, Richard and Nishihara, Robert and Moritz, Philipp and Fox, Roy and Goldberg, Ken and Gonzalez, Joseph and Jordan, Michael and Stoica, Ion , month =. RLlib: Abstractions for Distributed Reinforcement Learning , shorttitle =. International Conference on Machine Learning , publisher =. 2018 , note =

2018
[32]

Journal of Machine Learning Research , author =

Stable-Baselines3: Reliable Reinforcement Learning Implementations , volume =. Journal of Machine Learning Research , author =. 2021 , pages =

2021
[33]

and Potter, Donald K

Orwat, Joseph C. and Potter, Donald K. , year =. Application of the Extended Kalman Filter to Ballistic Trajectory Estimation and Prediction , url =
[34]

Journal of Aerospace Information Systems , author =

Bridging Reinforcement Learning and Online Learning for Spacecraft Attitude Control , volume =. Journal of Aerospace Information Systems , author =. 2021 , pages =. doi:10.2514/1.I010958 , number =

work page doi:10.2514/1.i010958 2021
[35]

Applied Sciences , author =

A Survey on Design and Control of Lower Extremity Exoskeletons for Bipedal Walking , volume =. Applied Sciences , author =. 2022 , file =. doi:10.3390/app12052395 , number =

work page doi:10.3390/app12052395 2022
[36]

International Conference on Neural Information Processing Systems , author =

Deep reinforcement learning in a handful of trials using probabilistic dynamics models , url =. International Conference on Neural Information Processing Systems , author =. 2018 , pages =

2018
[37]

Journal of LatinX in AI (LXAI) Research , author =

Terrain Classification Enhanced with Uncertainty for Space Exploration Robots from Proprioceptive Data , volume =. Journal of LatinX in AI (LXAI) Research , author =. 2023 , file =

2023
[38]

A New Extension of the Kalman Filter to Nonlinear Systems , doi =

Julier, Simon J and Uhlmann, Jeﬀrey K , year =. A New Extension of the Kalman Filter to Nonlinear Systems , doi =
[39]

International Conference on Machine Learning , author =

Learning Latent Dynamics for Planning from Pixels , abstract =. International Conference on Machine Learning , author =. 2019 , file =

2019
[40]

International Conference on Machine Learning , author =

BOHB: Robust and Efficient Hyperparameter Optimization at Scale , abstract =. International Conference on Machine Learning , author =. 2018 , file =

2018
[41]

Journal of Machine Learning Research , author =

Active Contextual Policy Search , volume =. Journal of Machine Learning Research , author =. 2014 , pages =

2014
[42]

Fixed-Time Fault-Tolerant Optimal Attitude Control of Spacecraft With Performance Constraint via Reinforcement Learning , journal=

Xiao, Bing and Zhang, Haichao and Chen, Zhaoyue and Cao, Lu , year =. Fixed-Time Fault-Tolerant Optimal Attitude Control of Spacecraft With Performance Constraint via Reinforcement Learning , journal=. doi:10.1109/TAES.2023.3292809 , number =

work page doi:10.1109/taes.2023.3292809 2023
[43]

International Conference on Hybrid Systems: Computation and Control , author =

A few lessons learned in reinforcement learning for quadcopter attitude control , doi =. International Conference on Hybrid Systems: Computation and Control , author =
[44]

International Conference on Neural Information Processing Systems , author =

When to trust your model: model-based policy optimization , url =. International Conference on Neural Information Processing Systems , author =. 2019 , file =

2019
[45]

AI , author =

RIANN-A Robust Neural Network Outperforms Attitude Estimation Filters , volume =. AI , author =. 2021 , pages =. doi:10.3390/ai2030028 , number =

work page doi:10.3390/ai2030028 2021
[46]

Industrial Robot: the international journal of robotics research and application , author =

Model-based deep reinforcement learning with heuristic search for satellite attitude control , volume =. Industrial Robot: the international journal of robotics research and application , author =. 2018 , note =. doi:10.1108/IR-05-2018-0086 , abstract =

work page doi:10.1108/ir-05-2018-0086 2018
[47]

IEEE Transactions on Control Systems Technology , author =

Reinforcement Learning-Based Approximate Optimal Control for Attitude Reorientation Under State Constraints , volume =. IEEE Transactions on Control Systems Technology , author =. 2021 , note =. doi:10.1109/TCST.2020.3007401 , abstract =

work page doi:10.1109/tcst.2020.3007401 2021
[48]

IEEE Transactions on Neural Networks and Learning Systems , author =

Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing. IEEE Transactions on Neural Networks and Learning Systems , author =. 2024 , pages =. doi:10.1109/TNNLS.2023.3263430 , number =

work page doi:10.1109/tnnls.2023.3263430 2024
[49]

Reinforcement learning with formal performance metrics for quadcopter attitude control under non-nominal contexts , volume =

Bernini, Nicola and Bessa, Mikhail and Delmas, Rémi and Gold, Arthur and Goubault, Eric and Pennec, Romain and Putot, Sylvie and Sillion, François , year =. Reinforcement learning with formal performance metrics for quadcopter attitude control under non-nominal contexts , volume =. doi:10.1016/j.engappai.2023.107090 , journal =

work page doi:10.1016/j.engappai.2023.107090 2023
[50]

Acta Astronautica , author =

Solar Orbiter fine pointing Mode improvement in flight: Challenges and achievements , volume =. Acta Astronautica , author =. 2023 , keywords =. doi:10.1016/j.actaastro.2023.09.016 , abstract =

work page doi:10.1016/j.actaastro.2023.09.016 2023
[51]

Mechatronics , author =

A survey on modularity and distributivity in series-parallel hybrid robots , volume =. Mechatronics , author =. 2020 , keywords =. doi:10.1016/j.mechatronics.2020.102367 , abstract =

work page doi:10.1016/j.mechatronics.2020.102367 2020
[52]

A Survey of Behavior Learning Applications in Robotics -- State of the Art and Perspectives , url =

Fabisch, Alexander and Petzoldt, Christoph and Otto, Marc and Kirchner, Frank , month =. A Survey of Behavior Learning Applications in Robotics -- State of the Art and Perspectives , url =. 2024 , note =. doi:10.48550/arXiv.1906.01868 , abstract =

work page doi:10.48550/arxiv.1906.01868 2024
[53]

Addressing Function Approximation Error in Actor-Critic Methods

Fujimoto, Scott and Hoof, Herke van and Meger, David , month =. Addressing Function Approximation Error in Actor -Critic Methods , url =. 2018 , note =. doi:10.48550/arXiv.1802.09477 , abstract =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1802.09477 2018
[54]

International Conference on Machine Learning , pages =

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , author =. International Conference on Machine Learning , pages =. 2018 , volume =

2018
[55]

Proximal Policy Optimization Algorithms

Schulman, John and Wolski, Filip and Dhariwal, Prafulla and Radford, Alec and Klimov, Oleg , month =. Proximal Policy Optimization Algorithms , url =. 2017 , note =. doi:10.48550/arXiv.1707.06347 , abstract =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1707.06347 2017
[56]

IEEE Transactions on Instrumentation and Measurement , author =

Deep-Learning -Based Neural Network Training for State Estimation Enhancement: Application to Attitude Estimation , volume =. IEEE Transactions on Instrumentation and Measurement , author =. 2020 , note =. doi:10.1109/TIM.2019.2895495 , abstract =

work page doi:10.1109/tim.2019.2895495 2020
[57]

In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Deep Adversarial Reinforcement Learning for Object Disentangling , url =. 2020 IEEE RSJ International Conference on Intelligent Robots and Systems (IROS) , author =. 2020 , note =. doi:10.1109/IROS45743.2020.9341578 , abstract =

work page doi:10.1109/iros45743.2020.9341578 2020
[58]

Grasping 3D Deformable Objects via Reinforcement Learning: A Benchmark and Evaluation , url =

Laux, Melvin and Singh, Chandandeep and Fabisch, Alexander , year =. Grasping 3D Deformable Objects via Reinforcement Learning: A Benchmark and Evaluation , url =
[59]

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , author =

Learning Deep Features for Discriminative Localization , url =. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , author =. 2016 , note =. doi:10.1109/CVPR.2016.319 , abstract =

work page doi:10.1109/cvpr.2016.319 2016
[60]

and Lee, Su-In , month =

Lundberg, Scott M. and Lee, Su-In , month =. A unified approach to interpreting model predictions , isbn =. International Conference on Neural Information Processing Systems , publisher =. 2017 , pages =

2017
[61]

Learning Deep Features for Discriminative Localization

Zhou, Bolei and Khosla, Aditya and Lapedriza, Agata and Oliva, Aude and Torralba, Antonio , month =. Learning Deep Features for Discriminative Localization , url =. 2015 , note =. doi:10.48550/arXiv.1512.04150 , abstract =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1512.04150 2015
[62]

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. International Journal of Computer Vision, 2020. doi:10.1007/s11263-019-01228-7
[63]

Robnik-Šikonja, Marko and Bohanec, Marko. Perturbation-Based Explanations of Prediction Models. In: Human and Machine Learning: Visible, Explainable, Trustworthy and Transparent, 2018.
[64]

Attitude estimation based on recurrent neural network and vector observations for attitude and heading reference system. 2019 14th IEEE International Conference on Electronic Measurement & Instruments (ICEMI), 2019. doi:10.1109/ICEMI46757.2019.9101833
[65]

Physics-informed neural ODE (PINODE): embedding physics into models using collocation points. Scientific Reports, 2023. doi:10.1038/s41598-023-36799-6
[66]

On Computational Complexity Reduction Methods for Kalman Filter Extensions. IEEE Aerospace and Electronic Systems Magazine, 2019. doi:10.1109/MAES.2019.2927898
[67]

Ozaki, Yoshihiko and Tanigaki, Yuki and Watanabe, Shuhei and Onishi, Masaki. Multiobjective tree-structured Parzen estimator for computationally expensive optimization problems. Genetic and Evolutionary Computation Conference, 2020. doi:10.1145/3377930.3389817
[68]

Unscented Kalman Filter-Trained Neural Networks for Slip Model Prediction. PLOS ONE, 2016. doi:10.1371/journal.pone.0158492
[69]

How to train your differentiable filter. Autonomous Robots, 2021. doi:10.1007/s10514-021-09990-9
[70]

Neural Network Augmentation of Attitude Estimation Using Navigation Satellite Signal Phase. IFAC Proceedings Volumes, 2007. doi:10.3182/20070829-3-RU-4911.00059
[71]

Kassahun, Yohannes and de Gea, Jose and Edgington, Mark and Metzen, Jan Hendrik and Kirchner, Frank. Accelerating neuroevolutionary methods using a Kalman filter. Annual Conference on Genetic and Evolutionary Computation, 2008. doi:10.1145/1389095.1389365
[72]

Physics-informed machine learning. Nature Reviews Physics, 2021. doi:10.1038/s42254-021-00314-5
[73]

Jonschkowski, Rico and Rastogi, Divyam and Brock, Oliver. Differentiable Particle Filters: End-to-End Learning with Algorithmic Priors. Robotics: Science and Systems XIV, 2018. doi:10.15607/RSS.2018.XIV.001
[74]

Physics Informed Deep Learning for Traffic State Estimation. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), 2020. doi:10.1109/ITSC45102.2020.9294236
[75]

Cubature Kalman Filters. IEEE Transactions on Automatic Control, 2009. doi:10.1109/TAC.2009.2019800
[76]

Learning in compressed space. Neural Networks, 2013. doi:10.1016/j.neunet.2013.01.020
[77]

Gruber, Michael. An Approach to Target Tracking. 1967.
[78]

HE2LM-AD: Hierarchical and efficient attitude determination framework with adaptive error compensation module based on ELM network. ISPRS Journal of Photogrammetry and Remote Sensing, 2023. doi:10.1016/j.isprsjprs.2022.12.010
[79]

Multi-Objective Surrogate-Model-Based Neural Architecture and Physical Design Co-Optimization of Energy Efficient Neural Network Hardware Accelerators. IEEE Transactions on Circuits and Systems I: Regular Papers, 2023. doi:10.1109/TCSI.2022.3209574
[80]

Domínguez, Raúl and De Lucas Alvarez, Mariela and Kadwe, Siddhant and Shette, Siddhant and Herztberg, Christoph and Cedric Danter, Leon and Jankovik, Marko and Vyas, Shubham and Eisenmenger, Jonas and Willenbrock, Pierre and Felmet, André and Unnithan, Vikram and Kirchner, Frank. PerSim: Perception for Planetary Prospection and Internal Simulation. 2023. doi:10.5281/zenodo.10634584
