
arxiv: 2604.06100 · v3 · pith:5YWKUSTTnew · submitted 2026-04-07 · 💻 cs.CR

Signature Placement in Post-Quantum TLS Certificate Hierarchies: An Experimental Study of ML-DSA and SLH-DSA in TLS 1.3 Authentication

Pith reviewed 2026-05-21 10:28 UTC · model grok-4.3

classification 💻 cs.CR
keywords post-quantum TLS · certificate hierarchies · ML-DSA · SLH-DSA · TLS 1.3 authentication · handshake latency · signature placement

The pith

Placing SLH-DSA in the server leaf certificate of a post-quantum TLS hierarchy increases handshake latency and server compute by orders of magnitude.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates how the position of post-quantum signature algorithms within a TLS certificate chain affects real-world authentication costs. Experiments show that SLH-DSA at the leaf position creates a sharp performance penalty on the server, while restricting it to upper layers and using ML-DSA at the leaf keeps costs within practical bounds. This matters because post-quantum migration requires balancing security with the latency and resource demands of live TLS sessions. The study tests multiple hierarchy depths, placements, and key-exchange combinations in a controlled OpenSSL setup to isolate where the costs concentrate.

Core claim

When SLH-DSA appears in the server leaf certificate, handshake latency and server-side cryptographic cost rise by orders of magnitude; confining SLH-DSA to intermediate or root certificates and retaining ML-DSA at the leaf keeps performance in a usable range. Transport size alone does not explain the difference: outside leaf-SLH cases, byte counts track latency, but leaf-SLH cases are dominated by server computation rather than data transfer.
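The gap between transport growth and latency growth can be checked with simple arithmetic. The sketch below uses the growth factors quoted in the Section 6.2.1 excerpt reproduced under Figure 2 (classical X25519, ML-DSA leaf replaced by SLH-DSA leaf). The function name `excess_over_transport` is illustrative and does not come from the paper.

```python
# Growth factors quoted in the paper's Section 6.2.1 excerpt
# (classical X25519, ML-DSA leaf replaced by an SLH-DSA leaf).
LATENCY_MEAN_FACTOR = 2127.86
LATENCY_P95_FACTOR = 1750.57
BYTES_READ_FACTOR = 3.35
CHAIN_BYTES_FACTOR = 2.98


def excess_over_transport(latency_factor: float, size_factor: float) -> float:
    """Return how many times faster latency grew than the transferred size."""
    return latency_factor / size_factor


mean_excess = excess_over_transport(LATENCY_MEAN_FACTOR, BYTES_READ_FACTOR)
p95_excess = excess_over_transport(LATENCY_P95_FACTOR, BYTES_READ_FACTOR)

print(f"mean latency grew about {mean_excess:.0f}x faster than bytes read")
print(f"p95 latency grew about {p95_excess:.0f}x faster than bytes read")
```

If latency were driven by transferred bytes alone, both ratios would sit near 1. They are in the hundreds, which is consistent with the claim that the leaf-SLH regime is compute-bound rather than transport-bound.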

What carries the argument

The placement of ML-DSA and SLH-DSA within the certificate hierarchy, which controls both the portion of the chain sent during the TLS handshake and the split of signing versus verification work between server and client.

If this is right

  • Strategies that keep SLH-DSA out of the leaf certificate avoid the heavy performance regime.
  • Chain size and transferred bytes predict latency only when the leaf uses the lighter algorithm.
  • Hybrid key exchanges do not remove the discontinuity created by a leaf SLH-DSA certificate.
  • Deeper hierarchies amplify the cost difference when the leaf signature is expensive.
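The placement rule behind these bullets can be sketched as a small classifier over scenario names. The naming pattern follows the one identifier quoted in the Figure 4 excerpt, `x25519mlkem768__ml_root__ml_int__ml_leaf`. The `slh` token, the other scenario names, and the regime labels `light` and `intermediate` are assumptions made for illustration, not names confirmed by the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    kex: str
    placements: dict = field(default_factory=dict)  # role -> algorithm token


def parse_scenario(name: str) -> Scenario:
    """Split '<kex>__<alg>_<role>__...' into a key-exchange label and placements."""
    kex, *layers = name.split("__")
    placements = {}
    for layer in layers:
        alg, role = layer.split("_", 1)
        placements[role] = alg
    return Scenario(kex=kex, placements=placements)


def regime(scenario: Scenario) -> str:
    """Classify a scenario by where the (assumed) 'slh' token appears."""
    if scenario.placements.get("leaf") == "slh":
        return "heavy"         # SLH-DSA signs the leaf certificate
    if "slh" in scenario.placements.values():
        return "intermediate"  # SLH-DSA confined to upper layers
    return "light"             # ML-DSA throughout


for name in (
    "x25519mlkem768__ml_root__ml_int__ml_leaf",   # quoted in the paper
    "x25519mlkem768__slh_root__ml_int__ml_leaf",  # hypothetical name
    "x25519__ml_root__slh_leaf",                  # hypothetical name
):
    print(name, "->", regime(parse_scenario(name)))
```

The classifier only encodes the qualitative finding. It says nothing about absolute cost, which depends on the measured values for each scenario.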

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Operators may standardize on ML-DSA for all server leaf certificates and reserve SLH-DSA for long-lived root or intermediate certificates.
  • The same placement logic could guide selection among other fast versus slow post-quantum signature families.
  • Load-testing under concurrent connections would show whether the leaf-SLH penalty limits server scalability.
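The scalability question in the last bullet can be framed with a back-of-envelope estimate. The sketch below takes two server task-clock values from the excerpts reproduced in the Figures section: 0.562 ms for the fully-ML baseline and a mean of 1410.8409 ms across leaf-SLH scenarios. It assumes one core is fully occupied by handshakes and ignores everything except signing cost. Comparing a single baseline scenario with a class mean is also a simplification, so the paper's own multiplier in Figure 18 may be computed differently.

```python
# Server task-clock per handshake, taken from the paper excerpts.
BASELINE_SERVER_MS = 0.562      # fully-ML hybrid depth-3 baseline
LEAF_SLH_SERVER_MS = 1410.8409  # mean across leaf-SLH scenarios


def handshakes_per_core_second(server_ms: float) -> float:
    """Upper bound on handshakes per second for one fully busy core."""
    return 1000.0 / server_ms


def infrastructure_multiplier(heavy_ms: float, baseline_ms: float) -> float:
    """Cores needed in the heavy regime to match one baseline core."""
    return heavy_ms / baseline_ms


baseline_rate = handshakes_per_core_second(BASELINE_SERVER_MS)
heavy_rate = handshakes_per_core_second(LEAF_SLH_SERVER_MS)
multiplier = infrastructure_multiplier(LEAF_SLH_SERVER_MS, BASELINE_SERVER_MS)

print(f"baseline: about {baseline_rate:.0f} handshakes/s per core")
print(f"leaf-SLH: about {heavy_rate:.2f} handshakes/s per core")
print(f"cores needed to hold throughput: about {multiplier:.0f}x")
```

Under these assumptions a core that could serve well over a thousand baseline handshakes per second serves fewer than one leaf-SLH handshake per second, which is why concurrent load testing would be informative.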

Load-bearing premise

Measurements taken in a local lab with OpenSSL 3 and oqsprovider reflect the costs that would appear in production TLS deployments under realistic network and load conditions.

What would settle it

Repeating the TLS 1.3 handshakes on production hardware and networks with SLH-DSA forced into the leaf certificate and measuring whether server CPU time and latency still jump by orders of magnitude.

Figures

Figures reproduced from arXiv: 2604.06100 by José Luis Delgado Jiménez.

Figure 1. Mean handshake latency by scenario.
… those regions lies a smaller intermediate band, populated mainly by configurations that place SLH-DSA in upper hierarchy layers while preserving ML-DSA in the interactive leaf

Figure 2. P95 handshake latency by scenario.
6.2.1 Classical KEX
Under classical X25519, replacing an ML-DSA leaf with an SLH-DSA leaf produces an extreme discontinuity. Mean latency increases by approximately 2127.86×, and p95 by approximately 1750.57×. The transport expansion is much smaller: bytes_read_mean grows by about 3.35×, and chain_bytes_unique by about 2.98×. The same asymmetry appears in the performance …

Figure 3. Campaign A: leaf-only comparison.
6.2.3 Main result of Campaign A
Campaign A establishes the first decisive empirical result of the paper. The dominant operational collapse does not require a complex hierarchy, a mixed certification strategy, or a pure post-quantum key-establishment path. It appears as soon as SLH-DSA is moved into the server leaf certificate. The main discontinuity is therefore visible be…

Figure 4. Campaign B: normalized strategy matrix.
6.3.1 Fully-ML baseline
The baseline strategy for Campaign B is the hybrid depth-3 hierarchy x25519mlkem768__ml_root__ml_int__ml_leaf. Its mean latency is 0.809 ms, its p95 is 1.000 ms, it reads 16008 bytes per handshake on average, and its server task-clock per run is approximately 0.562 ms. This scenario serves as the reference point for the relative comparisons re…

Figure 5. Operational plausibility ranking by scenario, expressed as latency relative to the fully …

Figure 6. Depth comparison by family.
6.4.4 Main result of Campaign C
Campaign C shows that hierarchy depth is not, by itself, a cost verdict. Whether additional depth penalizes, preserves, or even reduces observed cost depends on how the topology changes the effective chain seen during the handshake. This matters for the interpretation of post-quantum certificate hierarchies because it shows that logical depth and …

Figure 7. Relative effect of moving from depth 2 to depth 3 by comparable family, expressed as …

Figure 8. KEX comparison on comparable chains.

Figure 9. Signature placement summary.

Figure 10. Bytes read versus mean handshake latency.

Figure 11. Observed chain size versus mean handshake latency.

Figure 12. Transport overhead bytes by scenario.
… partially entangled. The inclusion of server-side performance counters in the final dataset closes that gap and allows the heavy regime to be interpreted in workload terms rather than only in latency terms. Given that leaf-SLH defines a distinct heavy regime, is that regime balanced across client and server, shifted toward validation on the client, or overwhelmingly c…

Figure 13. Server task-clock by scenario.
8.4 Overwhelmingly server-bound regime in leaf-SLH scenarios
The decisive result of the decomposition appears once SLH-DSA reaches the leaf. In that class, mean client task-clock is only about 3.0402 ms, while mean server task-clock rises to approximately 1410.8409 ms. The normalized ratios are even more revealing: the client contributes only about 0.0022 of elapsed time in…

Figure 14. Client versus server task-clock by scenario.

Figure 15. Client versus server task-clock by scenario on log-log axes. The diagonal marks …

Figure 16. Server task-clock over elapsed time.
What appears as a per-handshake cryptographic penalty becomes a constraint on concurrency, scaling, and operational headroom. This is why the paper moves beyond raw latency. Once the client/server decomposition showed that the heavy regime is overwhelmingly server-side, the relevant unit of interpretation became server CPU time per handshake. From that point onward, th…

Figure 17. Server-to-client task-clock ratio by scenario.

Figure 18. Infrastructure multiplier needed to preserve baseline throughput.

Figure 19. Extra cost per million handshakes.
… performance, but from the perspective of interactive service engineering they belong in the Unsuitable class. The heavy regime is operationally disqualifying for front-end TLS use.
10 Threats to Validity and Limitations
The conclusions of this paper are strong within the experimental setting studied, but they should not be read as universal statements about every possibl…

Figure 20. Monthly cost by service class.
10.2 Single implementation stack
The study is conducted on a specific implementation stack, namely OpenSSL 3 together with oqsprovider and liboqs. The results are implementation-dependent, different TLS libraries, certificate-handling paths, provider integrations, or post-quantum implementations could shift absolute values and potentially alter some lower-level trade-offs. F…

original abstract

Post-quantum migration in TLS 1.3 couples signature-algorithm choice with certificate-hierarchy structure, chain exposure during the handshake, and role-dependent cryptographic cost. In certificate-based authentication, the practical effect of a signature family depends on where it appears in the certification hierarchy, how much of that hierarchy is exposed during the handshake, and how the resulting cryptographic cost is distributed across client and server roles. Post-quantum TLS migration must therefore be evaluated as cryptographic design within authenticated key establishment, with algorithm selection assessed in its deployment context. This paper presents a local experimental study of TLS 1.3 authentication strategies implemented with OpenSSL 3 and oqsprovider. Using a reproducible laboratory setting, it compares ML-DSA and SLH-DSA across multiple certificate placements, hierarchy depths, and key-exchange modes, including classical, hybrid, and pure post-quantum configurations. The analysis is organized into four complementary campaigns: a leaf-only comparison, a full hierarchy strategy matrix, a depth comparison, and a key-exchange exploration. Across the experimental matrix, the main discontinuity appears when SLH-DSA is placed in the server leaf certificate. In that configuration, handshake latency and server-side compute cost increase by orders of magnitude, whereas strategies that confine SLH-DSA to upper trust layers and preserve ML-DSA in the interactive leaf remain within a more plausible operational range. The results also show that transport size alone does not explain the heavy regime: outside leaf-SLH scenarios, transferred bytes and observed chain size track latency closely, but once SLH-DSA reaches the leaf, server-side cryptographic cost becomes dominant.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper conducts a local experimental study of TLS 1.3 certificate hierarchies using OpenSSL 3 and oqsprovider, comparing ML-DSA and SLH-DSA placements across four campaigns (leaf-only, full hierarchy strategy matrix, depth comparison, and key-exchange exploration). It reports that placing SLH-DSA in the server leaf certificate produces orders-of-magnitude increases in handshake latency and server-side compute cost, while confining SLH-DSA to upper layers and retaining ML-DSA at the leaf keeps costs in a plausible range; transport size alone does not explain the heavy regime.

Significance. If the observed discontinuity holds, the work supplies concrete, actionable guidance for post-quantum TLS migration by showing that hybrid hierarchy designs can avoid prohibitive leaf costs. The reproducible laboratory setup and systematic coverage of four complementary campaigns are strengths that allow direct replication and extension.

major comments (2)
  1. [§4.2 and §5] §4.2 (full hierarchy strategy matrix) and §5 (results): the central operational claim—that leaf-SLH-DSA placements are impractical while upper-layer SLH-DSA with ML-DSA leaf remains viable—rests on a zero-latency local loopback or low-RTT LAN configuration. No experiments with added RTT, packet loss, or concurrent-connection load are reported, leaving open whether the fixed cryptographic penalty would remain dominant once network and queuing delays are convolved with the measurements.
  2. [§3 and §5] §3 (experimental methodology) and §5: the four campaigns report clear qualitative discontinuities but supply neither per-configuration repetition counts, error bars, nor statistical tests for the latency and server-compute figures. Without these, it is impossible to judge whether the reported orders-of-magnitude gap is robust to measurement variability or post-hoc selection of representative runs.
minor comments (2)
  1. [Figures 3–7] Figure captions and axis labels in the latency and byte-transfer plots could explicitly note the number of handshake repetitions and the exact timing points measured (e.g., from ClientHello to ServerFinished).
  2. [§4.4] The key-exchange exploration campaign (§4.4) mixes classical, hybrid, and pure-PQ KEMs; a short table summarizing which KEM was paired with each signature placement would improve readability.
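The first major comment can be illustrated with a toy additive model: total handshake time is network delay plus a fixed cryptographic cost. The sketch below is not the paper's method. The light cost of 0.809 ms is the baseline mean latency quoted in the Figure 4 excerpt, while the heavy cost of 1400 ms and the RTT values are hypothetical placeholders chosen only to show the shape of the effect.

```python
def relative_slowdown(rtt_ms: float, light_ms: float, heavy_ms: float,
                      round_trips: int = 1) -> float:
    """Ratio of heavy to light handshake time under an additive delay model."""
    network_ms = rtt_ms * round_trips
    return (network_ms + heavy_ms) / (network_ms + light_ms)


LIGHT_MS = 0.809   # baseline mean latency quoted in the paper
HEAVY_MS = 1400.0  # hypothetical leaf-SLH cost, for illustration only

for rtt_ms in (0.0, 50.0, 200.0):  # hypothetical round-trip times
    ratio = relative_slowdown(rtt_ms, LIGHT_MS, HEAVY_MS)
    extra_ms = HEAVY_MS - LIGHT_MS
    print(f"RTT {rtt_ms:5.1f} ms: slowdown {ratio:7.1f}x, extra {extra_ms:.0f} ms")
```

In this model the relative slowdown shrinks as RTT grows, but the absolute penalty per handshake does not change. Whether queuing under concurrent load makes the penalty better or worse is exactly what the additive model cannot say, which is the referee's point.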

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the detailed and constructive feedback on our paper. We address each of the major comments below and describe the planned revisions.

point-by-point responses
  1. Referee: [§4.2 and §5] §4.2 (full hierarchy strategy matrix) and §5 (results): the central operational claim—that leaf-SLH-DSA placements are impractical while upper-layer SLH-DSA with ML-DSA leaf remains viable—rests on a zero-latency local loopback or low-RTT LAN configuration. No experiments with added RTT, packet loss, or concurrent-connection load are reported, leaving open whether the fixed cryptographic penalty would remain dominant once network and queuing delays are convolved with the measurements.

    Authors: We appreciate this observation. The experiments were designed as a controlled laboratory study to isolate and quantify the pure cryptographic costs of different certificate hierarchy strategies without confounding network effects. This local setup (using loopback or low-RTT LAN) allows us to clearly demonstrate the orders-of-magnitude increase in server compute time for leaf-SLH-DSA placements. We agree that real-world scenarios include network delays, and the absolute handshake times would be higher. Nevertheless, the relative cost difference is large enough that leaf-SLH-DSA would likely remain impractical. We will add a paragraph in the discussion section (§5) acknowledging this limitation of the experimental environment and noting that future studies should incorporate network emulation tools to validate the findings under varied RTT and load conditions. This constitutes a partial revision. revision: partial

  2. Referee: [§3 and §5] §3 (experimental methodology) and §5: the four campaigns report clear qualitative discontinuities but supply neither per-configuration repetition counts, error bars, nor statistical tests for the latency and server-compute figures. Without these, it is impossible to judge whether the reported orders-of-magnitude gap is robust to measurement variability or post-hoc selection of representative runs.

    Authors: This is a fair critique. To ensure reproducibility, we conducted multiple independent runs for each configuration in our laboratory setup. However, we omitted the specific counts, variability measures, and any statistical analysis from the manuscript. In the revised version, we will expand §3 to detail the repetition counts (we performed 10 runs per configuration) and include error bars on the figures in §5 to represent the standard deviation across runs. Given the magnitude of the observed differences, formal statistical tests were not necessary, but we will add a note confirming the consistency of the results across repetitions. revision: yes
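The per-configuration reporting promised in the second response could look like the following sketch. It uses only the Python standard library. The two run lists are hypothetical numbers invented for illustration, sized at 10 runs to match the count stated in the response, and are not measurements from the paper.

```python
import statistics


def summarize(latencies_ms):
    """Return count, mean, sample standard deviation, standard error, and range."""
    n = len(latencies_ms)
    stdev = statistics.stdev(latencies_ms)  # sample stdev, needs n >= 2
    return {
        "n": n,
        "mean": statistics.mean(latencies_ms),
        "stdev": stdev,
        "sem": stdev / n ** 0.5,
        "min": min(latencies_ms),
        "max": max(latencies_ms),
    }


def ranges_disjoint(a, b):
    """True when the observed ranges of two summaries do not overlap."""
    return a["max"] < b["min"] or b["max"] < a["min"]


# Hypothetical latencies in milliseconds, 10 runs per configuration.
light_runs = [0.78, 0.80, 0.81, 0.79, 0.83, 0.82, 0.80, 0.81, 0.84, 0.81]
heavy_runs = [1395.0, 1402.5, 1410.1, 1399.8, 1421.3,
              1408.7, 1404.2, 1415.9, 1397.4, 1412.6]

light, heavy = summarize(light_runs), summarize(heavy_runs)
print(f"light: {light['mean']:.3f} ± {light['stdev']:.3f} ms (n={light['n']})")
print(f"heavy: {heavy['mean']:.1f} ± {heavy['stdev']:.1f} ms (n={heavy['n']})")
print("ranges disjoint:", ranges_disjoint(light, heavy))
```

Reporting the range alongside the standard deviation lets a reader see at a glance whether two configurations ever overlap, which is a weaker but more transparent check than a formal test.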

Circularity Check

0 steps flagged

No significant circularity; experimental measurements are direct

full rationale

The paper conducts a laboratory experimental study of TLS 1.3 authentication using OpenSSL 3 and oqsprovider, directly measuring handshake latency, server compute cost, and transport sizes across certificate hierarchy placements, depths, and key-exchange modes for ML-DSA and SLH-DSA. The reported discontinuity when SLH-DSA occupies the server leaf certificate is presented as an observed outcome of these measurements rather than any derived prediction, fitted parameter, or equation. No self-citations, ansatzes, uniqueness theorems, or renamings of known results are invoked to support the central claims; the findings rest on reproducible empirical data collection in a controlled setting. The derivation chain is therefore self-contained and does not reduce any result to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The study rests on standard assumptions about TLS 1.3 handshake mechanics and the correctness of the OpenSSL/oqsprovider implementations; no free parameters, ad-hoc axioms, or invented entities are introduced.

axioms (1)
  • domain assumption: OpenSSL 3 with oqsprovider correctly implements the tested post-quantum signature algorithms and TLS 1.3 handshake logic.
    Invoked by the choice of experimental platform in the abstract.

pith-pipeline@v0.9.0 · 5837 in / 1321 out tokens · 45596 ms · 2026-05-21T10:28:08.084842+00:00 · methodology


Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages

  1. David Cooper et al. Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile. RFC 5280. May 2008. doi: 10.17487/RFC5280. url: https://www.rfc-editor.org/rfc/rfc5280.html

  2. P. Kampanakis, S. Fluhrer, et al. Post-Quantum Cryptography Recommendations for Application Services. IETF Internet-Draft, draft-ietf-uta-pqc-app. Feb. 2026. url: https://datatracker.ietf.org/doc/draft-ietf-uta-pqc-app/

  3. J. A. Montenegro et al. A Performance Evaluation Framework for Post-Quantum TLS. University of Málaga technical paper / preprint. 2025. url: https://www.nics.uma.es/wp-content/papers/Monte2025pqc.pdf

  4. National Institute of Standards and Technology. Module-Lattice-Based Digital Signature Standard. Tech. rep. FIPS 204. U.S. Department of Commerce, Aug. 2024. url: https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.204.pdf

  5. National Institute of Standards and Technology. Module-Lattice-Based Key-Encapsulation Mechanism Standard. Tech. rep. FIPS 203. U.S. Department of Commerce, Aug. 2024. url: https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.203.pdf

  6. National Institute of Standards and Technology. Stateless Hash-Based Digital Signature Standard. Tech. rep. FIPS 205. U.S. Department of Commerce, Aug. 2024. url: https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.205.pdf

  7. Open Quantum Safe Project. liboqs. https://openquantumsafe.org/liboqs/. Accessed 2026-03-23. 2026.

  8. Open Quantum Safe Project. OpenSSL 3 Provider (OQS Provider). https://openquantumsafe.org/applications/tls.html. Accessed 2026-03-23. 2026.

  9. Sebastian Paul et al. Mixed Certificate Chains for the Transition to Post-Quantum Authentication in TLS 1.3. Cryptology ePrint Archive, Paper 2021/1447. 2021. url: https://eprint.iacr.org/2021/1447

  10. Eric Rescorla. The Transport Layer Security (TLS) Protocol Version 1.3. RFC 8446. Aug. 2018. doi: 10.17487/RFC8446. url: https://www.rfc-editor.org/rfc/rfc8446.html

  11. Dimitrios Sikeridis, Panos Kampanakis, and Michael Devetsikiotis. “Post-Quantum Authentication in TLS 1.3: A Performance Study”. In: Proceedings of the Network and Distributed System Security Symposium (NDSS). 2020. url: https://www.ndss-symposium.org/wp-content/uploads/2020/02/24203.pdf

  12. D. Stebila et al. ML-KEM Post-Quantum Key Agreement for TLS 1.3. IETF Internet-Draft, draft-ietf-tls-mlkem-07. Feb. 2026. url: https://datatracker.ietf.org/doc/draft-ietf-tls-mlkem/

  13. Sean Turner, Deirdre Connolly, et al. Hybrid Key Exchange in TLS 1.3. IETF Internet-Draft, draft-ietf-tls-hybrid-design-16. Feb. 2026. url: https://datatracker.ietf.org/doc/html/draft-ietf-tls-hybrid-design-16

A Scenario Inventory
This appendix provides the full experimental inventory used in the study. The main text discusses only those scenario d...