Learning Visually Interpretable Oscillator Networks for Soft Continuum Robots from Video

Annika Raatz; Henrik Krauss; Johann Licher; Naoya Takeishi; Takehisa Yairi

arXiv: 2511.18322 · v3 · submitted 2025-11-23 · cs.RO · cs.CV · cs.LG

Pith reviewed 2026-05-17 06:37 UTC · model grok-4.3

keywords: soft continuum robots · video-based dynamics learning · attention decoder · oscillator networks · latent dynamics · interpretable models · robot prediction · visual interpretability

The pith

ABCD attention maps and Visual Oscillator Networks produce accurate and mechanically interpretable models of soft continuum robot motion from video.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a fully data-driven method to learn the dynamics of soft continuum robots directly from video without prior models or manual design. It adds the Attention Broadcast Decoder (ABCD) to autoencoders so that each latent dimension maps back to a pixel-accurate attention mask that highlights moving robot parts and ignores static backgrounds. It pairs this with Visual Oscillator Networks (VONs) that treat the latent space as a set of coupled 2D oscillators whose masses, stiffnesses, and forces can be overlaid on the original image. The method is validated on single- and double-segment robots; on the two-segment robot it cuts multi-step prediction error by a factor of 5.8 for Koopman models and 3.5 for oscillator models, and the networks recover a chain-like structure on their own.

Core claim

The Attention Broadcast Decoder generates pixel-accurate attention maps that localize each latent dimension's contribution and filter static backgrounds. Visual Oscillator Networks then model the robot as a 2D network of oscillators whose parameters are visualized directly on the image. The approach is validated on single- and double-segment soft continuum robots; on the two-segment robot, ABCD-based models reduce multi-step prediction error by a factor of 5.8 for Koopman operators and 3.5 for oscillator networks. The networks autonomously discover the expected chain structure of oscillators.
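
To make the idea of per-latent attention maps concrete, here is a minimal sketch. It is not the paper's architecture: it assumes each latent dimension emits a logit map and a content image, adds a hypothetical constant background logit, normalizes per pixel with a softmax, and composites the result over a static background. The names `composite` and `bg_logit` are illustrative assumptions.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def composite(logit_maps, contents, background, bg_logit=0.0):
    """Blend per-latent content images over a static background.

    logit_maps, contents: K images (H x W nested lists), one per latent.
    background: H x W static image; bg_logit: assumed constant logit.
    Returns (image, masks), where masks[k][y][x] is latent k's attention.
    """
    height, width = len(background), len(background[0])
    k_latents = len(logit_maps)
    image = [[0.0] * width for _ in range(height)]
    masks = [[[0.0] * width for _ in range(height)] for _ in range(k_latents)]
    for y in range(height):
        for x in range(width):
            # Latent channels compete with the background for each pixel.
            logits = [m[y][x] for m in logit_maps] + [bg_logit]
            weights = softmax(logits)
            pixel = weights[-1] * background[y][x]
            for k in range(k_latents):
                masks[k][y][x] = weights[k]
                pixel += weights[k] * contents[k][y][x]
            image[y][x] = pixel
    return image, masks

# Toy 1x2 image: latent 0 claims the left pixel, nothing claims the right.
logits = [[[8.0, -8.0]]]
contents = [[[1.0, 1.0]]]
background = [[0.2, 0.2]]
image, masks = composite(logits, contents, background)
```

In this toy case the left pixel takes almost all of its value from latent 0, while the right pixel stays close to the background value, which is the filtering behavior the claim describes.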

What carries the argument

Attention Broadcast Decoder (ABCD) that produces spatially grounded attention maps for latent dimensions, coupled to Visual Oscillator Networks (VONs) that visualize masses, coupling stiffness, and forces on the image.
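
As a rough illustration of what a network of coupled oscillators with masses, coupling stiffnesses, and forces looks like in code, here is a minimal sketch. It assumes a generic damped spring network in 2D, integrated with semi-implicit Euler. The exact VON equations are not given on this page, and every name and parameter value below is hypothetical.

```python
def step(pos, vel, mass, anchor_k, coupling, damping, force, dt):
    """One semi-implicit Euler step for a 2D coupled oscillator network.

    pos, vel, force: lists of (x, y) displacements, velocities, inputs.
    anchor_k[i]: stiffness pulling oscillator i back to its rest point.
    coupling[i][j]: symmetric stiffness between oscillators i and j.
    """
    n = len(pos)
    new_vel = []
    for i in range(n):
        acc = []
        for axis in range(2):
            total = force[i][axis] - anchor_k[i] * pos[i][axis]
            total -= damping[i] * vel[i][axis]
            for j in range(n):
                total -= coupling[i][j] * (pos[i][axis] - pos[j][axis])
            acc.append(total / mass[i])
        new_vel.append((vel[i][0] + dt * acc[0], vel[i][1] + dt * acc[1]))
    # Positions use the updated velocities (semi-implicit update).
    new_pos = [(pos[i][0] + dt * new_vel[i][0], pos[i][1] + dt * new_vel[i][1])
               for i in range(n)]
    return new_pos, new_vel

def energy(pos, vel, mass, anchor_k, coupling):
    """Kinetic plus elastic energy of the network."""
    n = len(pos)
    total = 0.0
    for i in range(n):
        total += 0.5 * mass[i] * (vel[i][0] ** 2 + vel[i][1] ** 2)
        total += 0.5 * anchor_k[i] * (pos[i][0] ** 2 + pos[i][1] ** 2)
        for j in range(i + 1, n):
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            total += 0.5 * coupling[i][j] * (dx * dx + dy * dy)
    return total

# Three oscillators in a chain: 0-1 and 1-2 are coupled, 0-2 is not.
mass = [1.0, 1.0, 1.0]
anchor_k = [2.0, 0.5, 0.5]
coupling = [[0.0, 4.0, 0.0], [4.0, 0.0, 4.0], [0.0, 4.0, 0.0]]
damping = [0.3, 0.3, 0.3]
pos = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0)]  # tip displaced along x
vel = [(0.0, 0.0)] * 3
force = [(0.0, 0.0)] * 3

e_start = energy(pos, vel, mass, anchor_k, coupling)
for _ in range(5000):
    pos, vel = step(pos, vel, mass, anchor_k, coupling, damping, force, 0.01)
e_end = energy(pos, vel, mass, anchor_k, coupling)
```

With no external force, the damping terms drain the stored energy, so `e_end` ends up far below `e_start`. A learned model of this kind would fit the masses, stiffnesses, and forces from data rather than setting them by hand.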

If this is right

Multi-step state predictions become reliable enough for model-based control of soft robots.
The networks discover chain structures without being told the number of segments.
Both visual overlays and mechanical parameters become available for inspection and debugging.
The method requires no hand-crafted kinematic model or prior physical assumptions.
Compact latent models result that remain interpretable while improving accuracy.
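
The chain-structure statement in the list above can be made checkable. The sketch below is one hypothetical criterion, not taken from the paper: with the oscillators ordered along the suspected chain, it measures what fraction of the total coupling stiffness sits on nearest-neighbor pairs. The function name, the ordering assumption, and any threshold applied to the score are illustrative.

```python
def chain_score(coupling):
    """Fraction of total coupling stiffness on nearest-neighbor pairs.

    coupling: symmetric n x n matrix of stiffnesses, with oscillators
    already ordered along the suspected chain (an assumption).
    Returns a value in [0, 1]; 1.0 means a pure chain.
    """
    n = len(coupling)
    neighbor = 0.0
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            weight = abs(coupling[i][j])  # abs guards against sign noise
            total += weight
            if j == i + 1:
                neighbor += weight
    return neighbor / total if total > 0 else 0.0

# Hypothetical learned stiffnesses: strong 0-1, 1-2, 2-3 links, weak others.
learned = [
    [0.0, 5.0, 0.1, 0.0],
    [5.0, 0.0, 4.0, 0.2],
    [0.1, 4.0, 0.0, 6.0],
    [0.0, 0.2, 6.0, 0.0],
]
score = chain_score(learned)  # 15.0 / 15.3
```

A score near 1 indicates that almost all coupling lies along the chain. How close to 1 counts as "discovered" is a design choice that would need to be stated alongside the result.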

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same attention-plus-oscillator structure could be tested on videos of other deformable objects such as cloth or biological tissue.
The discovered chain of oscillators might be used to design modular controllers that treat each segment separately.
Adding noise or lighting variation to the training videos would test whether the attention maps remain stable.
The visualized forces and stiffnesses could serve as starting points for sim-to-real transfer in soft-robot control.

Load-bearing premise

The learned attention maps and oscillator parameters such as masses and coupling stiffness correspond to physically meaningful quantities that generalize beyond the single- and double-segment video datasets.

What would settle it

Train the model on single- and double-segment robot videos, then apply it to video of a three-segment robot under new lighting or camera angles and check whether the attention maps still align with actual moving parts and whether multi-step prediction error remains reduced.
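
The alignment check described here could be scored with intersection-over-union between a thresholded attention map and a mask of pixels that actually move. This is an assumed evaluation sketch: the threshold of 0.5 and the helper names are hypothetical, and the paper may use a different measure or none.

```python
def threshold(attention, level=0.5):
    """Binarize an attention map (H x W nested lists of floats in [0, 1])."""
    return [[value >= level for value in row] for row in attention]

def iou(mask_a, mask_b):
    """Intersection-over-union of two boolean masks of equal shape."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += a and b
            union += a or b
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0

attention = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.0],
]
moving = [
    [True, True, False],
    [False, False, False],
]
score = iou(threshold(attention), moving)  # 2 overlapping / 3 in union
```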

Figures

Figures reproduced from arXiv: 2511.18322 by Annika Raatz, Henrik Krauss, Johann Licher, Naoya Takeishi, Takehisa Yairi.

**Figure 2.** (a) The attention broadcast decoder (ABCD) inte…

**Figure 3.** Attention maps for the 1-segment and 2-segment…

**Figure 4.** On-image 2D oscillator networks for the 1-segment and 2-segment robots (left) and latent space visualization (right).

**Figure 5.** Reconstruction error over time and steps for both 1-segment and 2-segment robots and…

**Figure 6.** Multi-step reconstruction images (single step and every 6th step shown) over…

**Figure 7.** Latent space extrapolation for 2-segment robot mod…

Original abstract

Learning soft continuum robot (SCR) dynamics from video offers flexibility but existing methods lack interpretability or rely on prior assumptions. Model-based approaches require prior knowledge and manual design. We bridge this gap by introducing: (1) The Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics learning that generates pixel-accurate attention maps localizing each latent dimension's contribution while filtering static backgrounds, enabling visual interpretability via spatially grounded latents and on-image overlays. (2) Visual Oscillator Networks (VONs), a 2D latent oscillator network coupled to ABCD attention maps for on-image visualization of learned masses, coupling stiffness, and forces, enabling mechanical interpretability. We validate our approach on single- and double-segment SCRs, demonstrating that ABCD-based models significantly improve multi-step prediction accuracy with 5.8x error reduction for Koopman operators and 3.5x for oscillator networks on a two-segment robot. VONs autonomously discover a chain structure of oscillators. This fully data-driven approach yields compact, mechanically interpretable models with potential relevance for future control applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics models that produces pixel-accurate attention maps to localize each latent dimension's contribution to the image while suppressing static backgrounds. It further proposes Visual Oscillator Networks (VONs), which couple a 2D latent oscillator network to the ABCD maps to enable on-image visualization of learned masses, coupling stiffness, and forces. Experiments on single- and double-segment soft continuum robots demonstrate that ABCD-augmented models yield large gains in multi-step prediction accuracy (5.8x error reduction for Koopman operators and 3.5x for oscillator networks on the two-segment case) and that VONs autonomously recover a chain-like oscillator structure.

Significance. If the interpretability claims receive quantitative support, the work would offer a practical route to compact, visually and mechanically grounded latent models for soft robots learned directly from video, potentially aiding downstream control without manual kinematic assumptions.

major comments (2)

[Abstract] The central interpretability claim (that ABCD attention maps localize latent contributions in a spatially meaningful way and that VON parameters such as masses, coupling stiffness, and forces reflect actual robot mechanics) rests solely on prediction metrics and visual overlays. No quantitative recovery of known hardware parameters, energy-consistency checks, or comparison against analytical continuum models is reported, and such evidence is load-bearing for the asserted mechanical interpretability benefit.
[Abstract] The reported 5.8x and 3.5x multi-step error reductions on the two-segment robot are presented without reference to exact baseline implementations, data splits, ablation controls, or robustness to random seeds and longer horizons. These details are required to establish that the gains are attributable to ABCD/VON rather than dataset-specific fitting.
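
For concreteness, an error-reduction factor such as 5.8x is a ratio of baseline error to model error, and the seed robustness requested here amounts to reporting its spread. The sketch below uses made-up numbers purely for illustration; none of the values come from the paper, and `rollout_error` and `reduction_factor` are hypothetical helpers.

```python
from statistics import mean, stdev

def rollout_error(predicted, target):
    """Mean squared error over a multi-step rollout.

    predicted, target: lists of frames, each frame a flat list of pixels.
    """
    total = 0.0
    count = 0
    for frame_p, frame_t in zip(predicted, target):
        for p, t in zip(frame_p, frame_t):
            total += (p - t) ** 2
            count += 1
    return total / count

def reduction_factor(baseline_errors, model_errors):
    """Per-seed ratio baseline / model, with mean and sample std."""
    ratios = [b / m for b, m in zip(baseline_errors, model_errors)]
    return ratios, mean(ratios), stdev(ratios)

# Illustrative per-seed errors (not from the paper).
baseline = [0.060, 0.050, 0.070]
with_attention = [0.010, 0.010, 0.010]
ratios, avg, spread = reduction_factor(baseline, with_attention)
```

Reporting the mean ratio together with its spread makes it possible to judge whether a headline factor is stable across seeds or driven by one favorable run.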

minor comments (2)

The abstract refers to 'on-image overlays' and 'autonomous discovery of a chain structure'; the methods section should explicitly define the quantitative criterion (if any) used to declare a discovered chain structure versus incidental parameter clustering.
Notation for the oscillator parameters (m, k, f) and the precise form of the 2D latent dynamics should be introduced with equations early in the manuscript to avoid ambiguity when interpreting the visualized quantities.
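
As an indication of what such an early definition might look like, a generic damped, driven network of N coupled 2D oscillators can be written as below. This is a textbook form offered as an assumption, not the paper's equation. The symbols m, k, and f follow the referee's notation, while d_i (damping), k_ij (pairwise coupling stiffness), and the 2D displacement vector x_i are introduced here for illustration.

```latex
m_i \ddot{\mathbf{x}}_i
  = -k_i \mathbf{x}_i
    - \sum_{j \neq i} k_{ij}\,(\mathbf{x}_i - \mathbf{x}_j)
    - d_i \dot{\mathbf{x}}_i
    + \mathbf{f}_i,
\qquad \mathbf{x}_i \in \mathbb{R}^2, \quad i = 1, \dots, N .
```

In a form like this, a chain structure corresponds to k_ij being non-negligible only for neighboring indices.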

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our work. We address each major comment below and indicate the changes planned for the revised manuscript.

Referee: [Abstract] The central interpretability claim (that ABCD attention maps localize latent contributions in a spatially meaningful way and that VON parameters such as masses, coupling stiffness, and forces reflect actual robot mechanics) rests solely on prediction metrics and visual overlays. No quantitative recovery of known hardware parameters, energy-consistency checks, or comparison against analytical continuum models is reported, and such evidence is load-bearing for the asserted mechanical interpretability benefit.

Authors: We appreciate the referee's point that stronger quantitative grounding would bolster the mechanical interpretability claims. The current results demonstrate visual interpretability via ABCD attention maps that localize dynamic contributions while suppressing backgrounds, and mechanical interpretability via VONs that autonomously recover a chain-like oscillator structure matching the two-segment robot geometry. These outcomes, together with the large multi-step prediction gains, support the utility of the approach without manual kinematic priors. We agree that direct comparisons would strengthen the manuscript and have added a new subsection with parameter comparisons to simplified analytical continuum models for the single-segment case, along with a brief energy-consistency discussion for the learned VON dynamics.
Revision: yes

Referee: [Abstract] The reported 5.8x and 3.5x multi-step error reductions on the two-segment robot are presented without reference to exact baseline implementations, data splits, ablation controls, or robustness to random seeds and longer horizons. These details are required to establish that the gains are attributable to ABCD/VON rather than dataset-specific fitting.

Authors: We agree that additional experimental details are needed to substantiate the reported gains. The revised manuscript expands the Experiments section with precise specifications of the baseline Koopman and oscillator implementations, the training/validation/test split ratios, ablation studies that remove ABCD or the VON coupling, performance statistics over five random seeds, and extended-horizon rollouts beyond those shown in the original figures.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical validation rather than definitional reduction.

The paper introduces ABCD for attention-based latent localization and VONs for oscillator networks, reporting 5.8x and 3.5x multi-step error reductions plus autonomous chain discovery on video datasets. These outcomes are presented as results of training on single- and double-segment SCR videos, with interpretability arising from on-image overlays and parameter visualization. No equations or self-citations reduce the reported accuracy gains or structural discovery back to quantities defined by the fitted parameters themselves. The approach is self-contained as a data-driven method without load-bearing self-referential definitions or fitted-input-as-prediction patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only view limits visibility into exact free parameters; the method appears to rest on standard autoencoder reconstruction assumptions and the premise that oscillator networks can represent continuum mechanics.

axioms (2)

Domain assumption: Latent dynamics learned via autoencoders can be coupled to attention maps that localize each dimension's contribution on the image plane.
Invoked when describing ABCD as generating pixel-accurate attention maps for visual interpretability.
Domain assumption: A 2D latent oscillator network can represent masses, coupling stiffness, and forces in a way that matches soft continuum robot behavior.
Stated when introducing VONs for mechanical interpretability and autonomous discovery of chain structure.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear

VONs autonomously discover a chain structure of oscillators ... consistent with Cosserat rod theory.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
