pith. machine review for the scientific record.

arxiv: 2604.10357 · v4 · submitted 2026-04-11 · 💻 cs.CE

Recognition: unknown

A Total Lagrangian Finite Element Framework for Multibody Dynamics: Part II -- GPU Implementation and Numerical Experiments

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 15:07 UTC · model grok-4.3

classification 💻 cs.CE
keywords GPU acceleration · finite element method · multibody dynamics · total Lagrangian · Newton solver · contact modeling · real-time simulation · implicit time integration

The pith

GPU Newton solver reduces real-time factor by an order of magnitude for large flexible multibody simulations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a GPU-accelerated implementation of a total Lagrangian finite element method for finite-deformation multibody dynamics, including support for quadratic tetrahedral elements and ANCF beams and shells. It details a two-stage parallel strategy for evaluating internal forces and tangent stiffness, a Newton solver that uses fixed-sparsity refactorization for the global Hessian, and an asynchronous GPU collision algorithm on triangle meshes. Systematic benchmarks across element types and mesh sizes demonstrate that the GPU Newton approach achieves roughly ten times lower real-time factor than CPU baselines at the largest resolutions, while an augmented Lagrangian scheme handles constraints and a frictional contact model is checked against rigid-body formulas. These results matter for applications where modeling large deformable systems in real time is needed, such as vehicle dynamics or robotics.
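The two-stage evaluation pattern described above (independent per-element computation, then scatter into global arrays) can be sketched in plain NumPy. The toy spring elements and function names below are illustrative stand-ins, not the paper's kernels; on the GPU, stage 1 maps to one thread block per element and stage 2 to atomic adds or a segmented reduction.

```python
import numpy as np

def assemble_internal_forces(nodes, elements, element_force):
    """Two-stage assembly: stage 1 computes per-element force
    contributions independently (embarrassingly parallel); stage 2
    scatter-adds them into the global force vector."""
    f_global = np.zeros(nodes.size)

    # Stage 1: per-element local force vectors, no shared writes.
    local_forces = [element_force(nodes, conn) for conn in elements]

    # Stage 2: scatter into global DOFs (atomicAdd on a GPU).
    for conn, f_local in zip(elements, local_forces):
        np.add.at(f_global, np.asarray(conn), f_local)
    return f_global

# Toy 1D check: 3 nodes, 2 two-node "elements", unit-stiffness springs
# with rest length 1 (purely illustrative constitutive law).
def spring_force(x, conn):
    i, j = conn
    stretch = x[j] - x[i] - 1.0
    return np.array([stretch, -stretch])

x = np.array([0.0, 1.2, 2.0])
elems = [(0, 1), (1, 2)]
f = assemble_internal_forces(x, elems, spring_force)  # [0.2, -0.4, 0.2]
```

The same split applies per the summary to tangent stiffness: per-element matrices in stage 1, scatter into the fixed global sparsity in stage 2.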

Core claim

The central claim is that three ingredients together yield an order-of-magnitude reduction in real-time factor relative to CPU execution across T10, ANCF beam, and shell discretizations at the highest tested resolutions: a GPU-native two-stage parallelization for force and stiffness evaluation, fixed-sparsity cuDSS refactorization inside a Newton solver, and a two-thread asynchronous collision handler. Throughout, the velocity-based backward-Euler scheme with augmented Lagrangian constraints is claimed to preserve the same numerical behavior as the reference implementation.
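The velocity-based residual the claim refers to has, in its simplest unconstrained form, the shape r(v) = M(v − v_n) − h(f_ext − f_int(x_n + h·v)): positions are eliminated via the backward-Euler update, leaving the end-of-step velocity as the Newton unknown. A minimal sketch under that assumption (constraints and contact omitted; symbols illustrative):

```python
import numpy as np

def velocity_residual(v, v_n, x_n, h, M, f_int, f_ext):
    """Backward-Euler residual in velocity: x_{n+1} = x_n + h*v is
    substituted into the momentum balance, coupling inertia and
    internal/external forces in the single unknown v."""
    x_new = x_n + h * v
    return M @ (v - v_n) - h * (f_ext - f_int(x_new))

# 1-DOF spring-mass check: m = 1, k = 1, no external force.
M = np.array([[1.0]])
f_int = lambda x: 1.0 * x            # linear spring, so r(v) = 0 is linear
f_ext = np.array([0.0])
h, x_n, v_n = 0.1, np.array([1.0]), np.array([0.0])

# Closed-form root for the linear case: (m + h^2 k) v = m v_n - h k x_n
v = (M[0, 0] * v_n - h * x_n) / (M[0, 0] + h**2)
r = velocity_residual(v, v_n, x_n, h, M, f_int, f_ext)  # ~0
```

In the paper's nonlinear hyperelastic setting this root is found by the Newton (or AdamW) inner solve, with the ALM outer loop adding constraint forces to the residual.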

What carries the argument

The fixed-sparsity matrix strategy that eliminates repeated symbolic analysis and permits fast numerical refactorization of the sparse global Hessian on the GPU during Newton iterations.
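The fixed-sparsity idea can be mimicked on the CPU with SciPy: allocate the Hessian's nonzero pattern once, then each Newton iteration overwrite only the numeric values and refactorize. This is a structural sketch only; cuDSS additionally caches the symbolic analysis across refactorizations, which SciPy's `splu` repeats.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Fixed sparsity: the 2x2 pattern (here fully dense for brevity) is
# built once; only the value array changes afterwards.
indptr = np.array([0, 2, 4])
indices = np.array([0, 1, 0, 1])
H = sp.csr_matrix((np.zeros(4), indices, indptr), shape=(2, 2))

def refactor_and_solve(values, rhs):
    """Overwrite numeric values in place (pattern untouched) and
    factorize; with cuDSS the symbolic analysis would be reused here
    instead of redone."""
    H.data[:] = values
    return spla.splu(H.tocsc()).solve(rhs)

# Two "Newton iterations" with different Hessian values, same pattern.
x1 = refactor_and_solve(np.array([2.0, 0.0, 0.0, 2.0]), np.array([2.0, 4.0]))
x2 = refactor_and_solve(np.array([4.0, 0.0, 0.0, 4.0]), np.array([2.0, 4.0]))
```

The payoff claimed in the abstract is exactly this separation: symbolic analysis once per simulation, numerical refactorization once per Newton iteration.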

If this is right

  • At the largest mesh resolutions the Newton solver delivers approximately one order of magnitude lower real-time factor than CPU baselines.
  • The frictional contact model matches closed-form rigid-body predictions in both quasi-static and dynamic impact tests.
  • The fixed-sparsity refactorization removes the cost of repeated symbolic analysis across Newton iterations.
  • The two-stage GPU parallelization applies uniformly to quadratic tetrahedral, ANCF beam, and shell elements with hyperelastic and viscous constitutive laws.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same parallel structure could be reused for other implicit time integrators beyond backward Euler in continuum mechanics problems.
  • Real-time capability at these resolutions opens direct use in hardware-in-the-loop testing of deformable mechanisms.
  • The collision algorithm's avoidance of bounding-volume hierarchies may extend to other triangle-soup contact problems outside multibody dynamics.

Load-bearing premise

The GPU parallelization and solver choices preserve numerical accuracy and stability equivalent to the CPU reference for every element type, mesh resolution, and contact scenario examined.

What would settle it

Running the identical initial condition and time step on both GPU and CPU implementations and finding statistically significant differences in final positions, velocities, or contact forces for any of the three element types would falsify equivalence.
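The falsification test amounts to running both backends from identical initial conditions and comparing final states under a metric near machine precision. A hedged sketch of the comparison (the tolerance and trajectory arrays here are stand-ins, not the paper's protocol):

```python
import numpy as np

def relative_l2(a, b, eps=1e-30):
    """Relative L2 difference between two state vectors."""
    return np.linalg.norm(a - b) / (np.linalg.norm(b) + eps)

def states_equivalent(q_gpu, q_cpu, tol=1e-10):
    """Declare the backends numerically equivalent if final positions
    (or velocities, contact forces) agree to near round-off; any
    element type failing this would falsify equivalence."""
    return relative_l2(q_gpu, q_cpu) < tol

# Identical runs up to floating-point round-off should pass;
# a genuine divergence should not.
q_cpu = np.linspace(0.0, 1.0, 100)
q_gpu = q_cpu + 1e-15 * np.random.default_rng(0).standard_normal(100)
ok = states_equivalent(q_gpu, q_cpu)
```

Note that GPU parallel reductions reorder floating-point sums, so bitwise equality is too strict a criterion; a round-off-scale tolerance is the standard choice.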

read the original abstract

We present the numerical methods and GPU-accelerated implementation underlying a Total Lagrangian finite element framework for finite-deformation flexible multibody dynamics, introduced in the companion paper [1]. The framework supports 10-node quadratic tetrahedral (T10) elements and ANCF beam and shell elements, with quadrature-based hyperelastic response (St. Venant-Kirchhoff and Mooney-Rivlin) and an optional Kelvin-Voigt viscous stress contribution. Time stepping employs a velocity-based implicit backward-Euler scheme, yielding a nonlinear residual in velocity that couples inertia, internal and external forces, and bilateral constraints. Constraints are enforced via an augmented Lagrangian method (ALM), structured as an outer loop alternating an inner velocity solve with a dual-ascent multiplier update. We introduce a two-stage GPU parallelization strategy for internal force and tangent stiffness evaluation, and provide two inner solvers: a first-order AdamW optimizer and a second-order Newton solver that assembles and factorizes a sparse global Hessian on the GPU using cuDSS. A fixed-sparsity matrix strategy eliminates repeated symbolic analysis and enables efficient numerical refactorization across Newton iterations. For collision detection, we present a GPU-native two-thread asynchronous algorithm operating on triangle soups, avoiding bounding-volume hierarchies entirely. Systematic scaling benchmarks across all three supported element types and six mesh resolutions show that the Newton solver achieves approximately one order of magnitude reduction in real-time factor relative to CPU baselines at the largest resolutions tested. The frictional contact model is validated against closed-form rigid-body predictions through quasi-static and dynamic impact unit tests.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript presents the GPU-accelerated implementation and numerical experiments for a Total Lagrangian finite element framework for finite-deformation flexible multibody dynamics (companion to Part I). It supports 10-node quadratic tetrahedral (T10) elements and ANCF beam/shell elements with St. Venant-Kirchhoff and Mooney-Rivlin hyperelastic models plus optional Kelvin-Voigt viscosity. Time integration uses velocity-based implicit backward Euler, with constraints handled by an augmented Lagrangian method (ALM) featuring an outer dual-ascent loop and inner velocity solve. The implementation includes a two-stage GPU parallelization for internal force and tangent stiffness, a Newton solver assembling and factorizing sparse Hessians via cuDSS with fixed-sparsity refactorization (plus an AdamW optimizer alternative), and a GPU-native two-thread asynchronous collision algorithm on triangle soups without BVHs. Systematic scaling benchmarks across the three element types and six mesh resolutions report that the Newton solver achieves approximately one order of magnitude reduction in real-time factor versus CPU baselines at the largest resolutions; the frictional contact model is validated against closed-form rigid-body predictions in quasi-static and dynamic impact tests.

Significance. If the reported performance gains are achieved while preserving numerical accuracy and stability, the work provides a practical, scalable GPU framework for large-deformation flexible multibody simulations. The systematic benchmarks across multiple element types and resolutions, together with direct validation against closed-form solutions, strengthen the evidence for both efficiency and correctness. Such capabilities could enable previously intractable real-time or near-real-time analyses in robotics, biomechanics, and structural dynamics.

major comments (1)
  1. [Numerical Experiments] The central performance claim (approximately 10x reduction in real-time factor for the Newton solver) is predicated on the assumption that the two-stage GPU parallelization, fixed-sparsity cuDSS refactorization, and asynchronous collision algorithm preserve numerical accuracy and stability equivalent to the CPU reference. The manuscript does not appear to report quantitative error metrics (e.g., displacement or velocity L2 norms, residual histories, or convergence rates) comparing GPU and CPU solutions across the tested element types, mesh resolutions, and contact scenarios; without such data the speedup cannot be fully assessed as apples-to-apples.
minor comments (3)
  1. The term 'real-time factor' is used in the benchmark summary but is not explicitly defined (e.g., as wall-clock time divided by simulated time, or the inverse); a clear definition and units would improve interpretability of the scaling results.
  2. Acronyms such as ALM, ANCF, and cuDSS should be expanded at first use in the main text even if they appear in the abstract.
  3. The description of the two-thread asynchronous collision algorithm would benefit from a short pseudocode listing or diagram to clarify thread responsibilities and synchronization.
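On minor comment 1: one common convention, which the paper may or may not follow, defines the real-time factor as wall-clock seconds per simulated second (RTF < 1 meaning faster than real time). A minimal sketch of that convention:

```python
import time

def real_time_factor(step_fn, n_steps, dt):
    """RTF = wall-clock time / simulated time, one common convention
    (some codes report the inverse). step_fn advances one time step
    of size dt; both are hypothetical stand-ins here."""
    t0 = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    wall = time.perf_counter() - t0
    return wall / (n_steps * dt)

rtf = real_time_factor(lambda: None, n_steps=1000, dt=1e-3)
```

Under this convention the reported "order of magnitude reduction in real-time factor" means the GPU solver spends roughly one tenth the wall-clock time per simulated second.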

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their constructive review and for recognizing the potential of the GPU-accelerated total Lagrangian framework. We address the single major comment below and will incorporate the suggested improvements in the revised manuscript.

read point-by-point responses
  1. Referee: [Numerical Experiments] The central performance claim (approximately 10x reduction in real-time factor for the Newton solver) is predicated on the assumption that the two-stage GPU parallelization, fixed-sparsity cuDSS refactorization, and asynchronous collision algorithm preserve numerical accuracy and stability equivalent to the CPU reference. The manuscript does not appear to report quantitative error metrics (e.g., displacement or velocity L2 norms, residual histories, or convergence rates) comparing GPU and CPU solutions across the tested element types, mesh resolutions, and contact scenarios; without such data the speedup cannot be fully assessed as apples-to-apples.

    Authors: We agree that explicit quantitative comparisons would strengthen the validation. The GPU implementation replicates the identical numerical algorithms from Part I (velocity-based implicit backward Euler, augmented Lagrangian constraints, St. Venant-Kirchhoff/Mooney-Rivlin hyperelasticity, and the same quadrature rules), so the solutions are mathematically equivalent; the two-stage parallelization, fixed-sparsity cuDSS refactorization, and asynchronous collision detection are purely computational optimizations that preserve the residual and Jacobian. Nevertheless, to address the concern directly, the revised manuscript will add a new subsection in Numerical Experiments that reports L2-norm differences in nodal displacements and velocities, plus Newton residual histories, between GPU and CPU runs for representative cases spanning all three element types, multiple mesh resolutions, and both frictional contact scenarios. These metrics will demonstrate that discrepancies remain at machine precision and do not affect the reported performance gains. revision: yes

Circularity Check

0 steps flagged

No significant circularity; benchmarks anchored to external CPU and closed-form references

full rationale

The paper describes a GPU implementation and scaling experiments for a Total Lagrangian FE framework whose theoretical foundation is introduced in the companion paper [1]. All load-bearing performance claims (one order of magnitude real-time factor reduction) are obtained by direct comparison against independent CPU baselines and closed-form rigid-body solutions for contact validation. No equations, fitted parameters, or uniqueness statements are shown to reduce by construction to the paper's own inputs or to a self-citation chain. The two-stage GPU parallelization, cuDSS refactorization, and asynchronous collision algorithm are presented as engineering choices whose correctness is assessed via external numerical equivalence tests rather than internal redefinition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard continuum mechanics and finite-element assumptions for hyperelastic constitutive models and implicit integration; no new free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Standard assumptions of finite strain continuum mechanics and quadrature-based hyperelastic constitutive response hold for T10, ANCF beam, and shell elements.
    Invoked implicitly by the choice of element types and material models described in the abstract.

pith-pipeline@v0.9.0 · 5591 in / 1458 out tokens · 111177 ms · 2026-05-10T15:07:54.108209+00:00 · methodology

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Total Lagrangian Finite Element Framework for Multibody Dynamics: Part I -- Formulation

    cs.CE · 2026-02 · unverdicted · novelty 5.0

    A Total Lagrangian finite element framework is derived for finite-deformation multibody dynamics with joints, contacts, and common material models.

Reference graph

Works this paper leans on

60 extracted references · 30 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1] Belytschko, T., Liu, W.K., Moran, B.: Nonlinear Finite Elements for Continua and Structures. John Wiley & Sons, Chichester, UK (2000)

  2. [2] Bonet, J., Gil, A.J., Wood, R.D.: Nonlinear Solid Mechanics for Finite Element Analysis.

  3. [3] Zhou, Z., Arivoli, G., Negrut, D.: A Total Lagrangian Finite Element Framework for Multibody Dynamics: Part I – Formulation (2026). https://arxiv.org/abs/2602.17002

  4. [4] Zhou, Z., Arivoli, G., SBEL, University of Wisconsin-Madison: Total-Lagrangian-FEA: Research Code for Total-Lagrangian Finite Element Method for Flexible Multibody Dynamics. https://github.com/uwsbel/Total-Lagrangian-FEA

  5. [5] Georgii, J., Westermann, R.: Interactive simulation of deformable bodies on GPUs. In: Proceedings of the Conference on Simulation and Visualization (SimVis), pp. 209–218. SCS Publishing House, Erlangen, Germany (2005)

  6. [6] Macklin, M., Müller, M., Chentanez, N., Kim, T.-Y.: Unified particle physics for real-time applications. ACM Transactions on Graphics (TOG) 33(4), 1–12 (2014)

  7. [7] Hu, Y., Li, T.-M., Anderson, L., Ragan-Kelley, J., Durand, F.: Taichi: a language for high-performance computation on spatially sparse data structures. ACM Transactions on Graphics (TOG) 38(6), 1–16 (2019)

  8. [8] Faure, F., Duriez, C., Delingette, H., Allard, J., Gilles, B., Marchesseau, S., Talbot, H., Courtecuisse, H., Bousquet, G., Peterlik, I., et al.: SOFA: A multi-model framework for interactive physical simulation. In: Soft Tissue Biomechanical Modeling for Computer Assisted Surgery, pp. 283–321. Springer, Berlin, Heidelberg (2012)

  9. [9] NVIDIA Corporation: NVIDIA cuDSS: GPU-Accelerated Direct Sparse Solver Library. https://developer.nvidia.com/cudss. Version 0.5.0, Preview Release (2025)

  10. [10] Géradin, M., Rixen, D.: Mechanical Vibrations: Theory and Application to Structural Dynamics, 2nd edn. Wiley, Chichester, UK (1994)

  11. [11] Brüls, O., Cardona, A., Arnold, M.: Lie group generalized-α time integration of constrained flexible multibody systems. Mechanism and Machine Theory 48, 121–137 (2012)

  12. [12] Baumgarte, J.: Stabilization of constraints and integrals of motion in dynamical systems. Computer Methods in Applied Mechanics and Engineering 1, 1–16 (1972)

  13. [13] Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Belmont, MA (1995)

  14. [14] Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York, NY (1999)

  15. [15] Larsson, T., Akenine-Möller, T.: A dynamic bounding volume hierarchy for generalized collision detection. Computers & Graphics 30(3), 450–459 (2006). https://doi.org/10.1016/j.cag.2006.02.011

  16. [16] Lauterbach, C., Garland, M., Sengupta, S., Luebke, D., Manocha, D.: Fast BVH construction on GPUs. Computer Graphics Forum 28(2), 375–384 (2009). https://doi.org/10.1111/j.1467-8659.2009.01377.x

  17. [17] Teschner, M., Kimmerle, S., Heidelberger, B., Zachmann, G., Raghupathi, L., Fuhrmann, A., Cani, M.-P., Faure, F., Magnenat-Thalmann, N., Strasser, W., Volino, P.: Collision detection for deformable objects. Computer Graphics Forum 24(1), 61–81 (2005). https://doi.org/10.1111/j.1467-8659.2005.00829.x

  18. [18] Alnæs, M.S., Blechta, J., Hake, J., Johansson, A., Kehlet, B., Logg, A., Richardson, C., Ring, J., Rognes, M.E., Wells, G.N.: The FEniCS project version 1.5. Archive of Numerical Software 3(100), 9–23 (2015). https://doi.org/10.11588/ans.2015.100.20553

  19. [19] Mazier, A., Bilger, A., Forte, A.E., Peterlik, I., Hale, J.S., Bordas, S.P.A.: Inverse deformation analysis: an experimental and numerical assessment using the FEniCS project. Engineering with Computers 38(5), 4099–4113 (2022). https://doi.org/10.1007/s00366-021-01597-z

  20. [20] Xue, T., Liao, S., Gan, Z., Park, C., Xie, X., Liu, W.K., Cao, J.: JAX-FEM: A differentiable GPU-accelerated 3D finite element solver for automatic inverse design and mechanistic data science. Computer Physics Communications 291, 108802 (2023). https://doi.org/10.1016/j.cpc.2023.108802

  21. [21] Tasora, A., Serban, R., Mazhar, H., Pazouki, A., Melanz, D., Fleischmann, J., Taylor, M., Sugiyama, H., Negrut, D.: Chrono: An open source multi-physics dynamics engine. In: Kozubek, T., Blaheta, R., Šístek, J., Rozložník, M., Čermák, M. (eds.) High Performance Computing in Science and Engineering, pp. 19–49. Springer, Cham (2016)

  22. [22] Nachbagauer, K., Gruber, P., Gerstmayr, J.: Structural and continuum mechanics approaches for a 3D shear deformable ANCF beam finite element: Application to static and linearized dynamic examples. Journal of Computational and Nonlinear Dynamics 8(2), 021004 (2012)

  23. [23] Yamashita, H., Valkeapää, A.I., Jayakumar, P., Sugiyama, H.: Continuum mechanics based bilinear shear deformable shell element using Absolute Nodal Coordinate Formulation. Journal of Computational and Nonlinear Dynamics 10(5), 051012 (2015)

  24. [24] Taylor, M., Serban, R., Negrut, D.: An efficiency comparison of different ANCF implementations. International Journal of Non-Linear Mechanics 149, 104308 (2023). https://doi.org/10.1016/j.ijnonlinmec.2022.104308

  25. [25] Taylor, M., Serban, R., Negrut, D.: Implementation implications on the performance of ANCF simulations. International Journal of Non-Linear Mechanics 149, 104328 (2023). https://doi.org/10.1016/j.ijnonlinmec.2022.104328

  26. [26] Shabana, A.A., Yakoub, R.Y.: Three dimensional absolute nodal coordinate formulation for beam elements: Theory. ASME Journal of Mechanical Design 123, 606–613 (2001)

  27. [27] Karras, T.: Maximizing parallelism in the construction of BVHs, octrees, and k-d trees. In: Proceedings of High Performance Graphics 2012, pp. 33–37. The Eurographics Association, Goslar, Germany (2012). https://doi.org/10.2312/EGGH/HPG12/033-037

  28. [28] Lauterbach, C., Mo, Q., Manocha, D.: gProximity: Hierarchical GPU-based operations for collision and distance queries. Computer Graphics Forum 29(2), 419–428 (2010). https://doi.org/10.1111/j.1467-8659.2009.01611.x

  29. [29] Tang, M., Curtis, S., Yoon, S., Manocha, D.: Collision-streams: Fast GPU-based collision detection for deformable models. In: Proceedings of the 2011 Symposium on Interactive 3D Graphics and Games (I3D '11), pp. 63–70. ACM, New York, NY (2011). https://doi.org/10.1145/1944745.1944756

  30. [30] Chitalu, F.M., Dubach, C., Komura, T.: Bulk-synchronous parallel simultaneous BVH traversal for collision detection on GPUs. In: Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D '18), pp. 4–19. ACM, New York, NY (2018). https://doi.org/10.1145/3190834.3190848

  31. [31] Gilbert, E.G., Johnson, D.W., Keerthi, S.S.: A fast procedure for computing the distance between complex objects in three-dimensional space. IEEE Journal on Robotics and Automation 4(2), 193–203 (1988). https://doi.org/10.1109/56.2083

  32. [32] Snethen, G.: XenoCollide: Complex collision made simple. In: Jacobs, S. (ed.) Game Programming Gems 7, pp. 165–178. Charles River Media, Hingham, MA (2008)

  33. [33] Pan, J., Manocha, D.: GPU-based parallel collision detection for fast motion planning. The International Journal of Robotics Research 31(2), 187–200 (2012). https://doi.org/10.1177/0278364911429335

  34. [34] Hubbard, P.M.: Approximating polyhedra with spheres for time-critical collision detection. ACM Transactions on Graphics 15(3), 179–210 (1996). https://doi.org/10.1145/231731.231732

  35. [35] Bradshaw, G., O'Sullivan, C.: Adaptive medial-axis approximation for sphere-tree construction. ACM Transactions on Graphics 23(1), 1–26 (2004). https://doi.org/10.1145/966131.966132

  36. [36] Zhang, R., Tagliafierro, B., Vanden Heuvel, C., Sabarwal, S., Bakke, L., Yue, Y., Wei, X., Serban, R., Negrut, D.: Chrono DEM-Engine: A Discrete Element Method dual-GPU simulator with customizable contact forces and element shape. Computer Physics Communications 300, 109196 (2024). https://doi.org/10.1016/j.cpc.2024.109196

  37. [37] Mazhar, H., Heyn, T., Negrut, D.: A scalable parallel method for large collision detection problems. Multibody System Dynamics 26, 37–55 (2011). https://doi.org/10.1007/s11044-011-9246-y

  38. [38] Fleischmann, J., Serban, R., Negrut, D., Jayakumar, P.: On the importance of displacement history in soft-body contact models. Journal of Computational and Nonlinear Dynamics 11(4), 044502 (2016)

  39. [39] Johnson, K.L.: Contact Mechanics. Cambridge University Press, Cambridge, UK (1987)

  40. [40] Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM, Philadelphia, PA (2003). https://doi.org/10.1137/1.9780898718003

  41. [41] Davis, T.A.: Direct Methods for Sparse Linear Systems. SIAM, Philadelphia, PA (2006). https://doi.org/10.1137/1.9780898718881

  42. [42] Balay, S., Gropp, W.D., McInnes, L.C., Smith, B.F.: Efficient management of parallelism in object oriented numerical software libraries. In: Arge, E., Bruaset, A.M., Langtangen, H.P. (eds.) Modern Software Tools in Scientific Computing, pp. 163–202. Birkhäuser Press, Boston, MA (1997)

  43. [43] Heroux, M.A., Bartlett, R.A., Howle, V.E., Hoekstra, R.J., Hu, J.J., Kolda, T.G., Lehoucq, R.B., Long, K.R., Pawlowski, R.P., Phipps, E.T., Salinger, A.G., Thornquist, H.K., Tuminaro, R.S., Willenbring, J.M., Williams, A., Stanley, K.S.: An overview of the Trilinos project. ACM Transactions on Mathematical Software 31(3), 397–423 (2005). https://doi.org/10.1145/1089014.1089021

  44. [44] Anderson, R., Andrej, J., Barker, A., Bramwell, J., Camier, J.-S., Cerveny, J., Dobrev, V., Dudouit, Y., Fisher, A., Kolev, T., Pazner, W., Stowell, M., Tomov, V., Akkerman, I., Dahm, J., Medina, D., Zampini, S.: MFEM: A modular finite element methods library. Computers & Mathematics with Applications 81, 42–74 (2021). https://doi.org/10.1016/j.camwa.2020.06.009

  45. [45] Merrill, D.: CUB: CUDA UnBound, a library of warp-wide, block-wide, and device-wide GPU parallel primitives. NVIDIA Research. http://nvlabs.github.io/cub/ (2015)

  46. [46] Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. In: International Conference on Learning Representations (ICLR) (2019). https://doi.org/10.48550/arXiv.1711.05101

  47. [47] Project Chrono: Chrono: An Open Source Framework for the Physics-Based Simulation of Dynamic Systems. http://projectchrono.org. Accessed: 2020-03-03 (2020)

  48. [48] Balay, S., Abhyankar, S., Adams, M.F., Brown, J., Brune, P., Buschelman, K., Eijkhout, V., Gropp, W.D., Kaushik, D., Knepley, M.G., McInnes, L.C., Rupp, K., Smith, B.F., Zhang, H.: PETSc users manual. Technical Report ANL-95/11 – Revision 3.5, Argonne National Laboratory (2014). http://www.mcs.anl.gov/petsc

  49. [49] Williams, S., Waterman, A., Patterson, D.: Roofline: an insightful visual performance model for multicore architectures. Communications of the ACM 52(4), 65–76 (2009)

  50. [50] Wu, C.-Y., Thornton, C., Li, L.-Y.: Coefficients of restitution for elastoplastic oblique impacts. Advanced Powder Technology 14(4), 435–448 (2003)

  51. [51] Yu, K., Elghannay, H.A., Tafti, D.: An impulse based model for spherical particle collisions with sliding and rolling. Powder Technology 319, 102–116 (2017)

  52. [52] U.S. Department of Defense: Package Cushioning Design. Military Handbook MIL-HDBK-304B (October 1978)

  53. [53] Húlan, T., Štubňa, I., Ondruška, J., Csáki, Š., Lukáč, F., Mánik, M., Vozár, L., Ozolins, J., Kaljuvee, T., Trník, A.: Young's modulus of different illitic clays during heating and cooling stage of firing. Materials 13(21), 4968 (2020). https://doi.org/10.3390/ma13214968

  54. [54] Kojima, S.: Poisson's ratio of glasses, ceramics, and crystals. Materials 17(2), 300 (2024). https://doi.org/10.3390/ma17020300

  55. [55] Dassault Systèmes Simulia Corp.: Hyperelastic Behavior of Rubberlike Materials. Abaqus 2024 Documentation, accessed 2026-03-16. https://docs.software.vt.edu/abaqusv2024/English/?show=SIMACAEMATRefMap%2Fsimamat-c-hyperelastic.htm

  56. [56] Han, D., Che, W.: Comparison of the shear modulus of an offshore elastomeric bearing between numerical simulation and experiment. Applied Sciences 11(10), 4384 (2021). https://doi.org/10.3390/app11104384

  57. [57] Mamashli, H., Gerdooei, M., Ghafourian Nosrati, H.: Parametric study of piercing force and surface quality in elastomer-assisted tube piercing. Scientific Reports 16, 3612 (2026). https://doi.org/10.1038/s41598-025-33673-5

  58. [58] Chen, H., Sun, D., Gao, L., Liu, X., Zhang, M.: Mechanical behavior of closed-cell ethylene-vinyl acetate foam under compression. Polymers 16(1), 34 (2024). https://doi.org/10.3390/polym16010034

  59. [59] The Rubber Company: 50 Shore Neoprene Rubber Sheeting Datasheet. Density = 1.35 g/cm³ (1350 kg/m³) (n.d.)

  60. [60] Delta Rubber: Neoprene Rubber Sheets / BS2752. 60 Shore datasheet. Density = 1.40 g/cm³ (1400 kg/m³) (n.d.)