pith. sign in

arxiv: 2604.17003 · v2 · pith:DIIYN5UPnew · submitted 2026-04-18 · 💻 cs.CR

From Public-Key Linting to Operational Post-Quantum X.509 Assurance for ML-KEM and ML-DSA: Registry-Driven Policy, Mutation-Based Evaluation, and Import Validation

Pith reviewed 2026-05-21 09:00 UTC · model grok-4.3

classification 💻 cs.CR
keywords post-quantum X.509ML-KEMML-DSAcertificate assuranceregistry-driven policymutation-based evaluationprivate-key importpublic-key linting
0
0 comments X

The pith

A registry-driven framework detects all invalid ML-KEM and ML-DSA X.509 artifacts in a 48-case corpus with zero false positives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes an executable post-quantum X.509 assurance framework that reifies 17 requirements drawn from the final FIPS and PKIX standards for ML-KEM and ML-DSA into a single registry. The registry tags each requirement by owner, stage, detector kind, normative strength, and mode-specific action, then bundles them into three gate packs that address certificate profiles, SubjectPublicKeyInfo representations, and private-key container imports. Evaluation proceeds against a frozen mutation-based corpus of 21 valid and 27 invalid artifacts. A sympathetic reader would care because existing public-key linting tools leave gaps at deployment-facing semantics and sometimes reject valid certificates, whereas the new framework separates certification-authority checks from importer checks and reports complete coverage in both strict and deployable policy modes.

Core claim

The central claim is that reifying 17 final-standards requirements into an assurance registry indexed by owner, stage, detector kind, normative strength, and mode-specific action, then evaluating the resulting gate packs on a bounded mutation corpus, yields full detection of all expected invalid artifacts in both strict and deployable modes with zero false positives on 48 controlled artifacts, complete coverage of the seven importer-owned private-key requirements, and superior results to a JZLint baseline that meets only five of ten expected detections while fatally rejecting three valid ML-KEM certificates.

What carries the argument

The assurance registry that indexes each of the 17 requirements by owner, stage, detector kind, normative strength, and mode-specific action and packages them into operator gate packs for certificate, SPKI, and private-key surfaces.

If this is right

  • Strict mode blocks all 17 active requirements while deployable mode downgrades exactly one ML-KEM canonicality condition to a warning.
  • The framework covers the full importer-owned private-key surface with 7/7 expected invalid detections and no detector gaps.
  • On the certificate subset the framework meets 10/10 expected invalid detections without rejecting any valid cases.
  • Checks are explicitly partitioned between certification-authority and artifact-importer responsibilities.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The registry structure could accept future standard revisions by adding or updating entries without altering the evaluation engine.
  • The mutation corpus supplies a reusable test suite that other post-quantum X.509 tools could adopt for regression coverage.
  • Deployable mode supports phased introduction of post-quantum certificates by converting one check from block to warning.

Load-bearing premise

The 27 invalid artifacts in the frozen mutation corpus represent the failure modes that will actually appear in real deployments and the 17 reified requirements capture every normative requirement from the final FIPS and PKIX standards.

What would settle it

A real-world ML-KEM or ML-DSA certificate that violates one of the 17 requirements yet passes the framework undetected, or a valid certificate that the framework rejects.

Figures

Figures reproduced from arXiv: 2604.17003 by Jos\'e Luis Delgado Jim\'enez.

Figure 2
Figure 2. Figure 2: Operational assurance boundaries in pkix-core. Final standards provide the normative floor, but the actionable workflow emerges only after requirements are assigned by owner and stage. The notion of gate pack makes this split usable. A gate pack is the smallest operator-facing assurance bundle that answers what should be run at a specific stage by a specific owner. In the current profile there are exactly … view at source ↗
Figure 3
Figure 3. Figure 3: From normative text to executable gate. The registry is the mechanism that turns standards [PITH_FULL_IMAGE:figures/full_fig_p012_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: From registry records to operator workflow. Requirements are assigned to owners and stages, [PITH_FULL_IMAGE:figures/full_fig_p014_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Mode-aware policy in the current artifact.x. [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Artifact realization and evidence flow. Project claims are first fixed in the registry, corpus, [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Evaluation design with primary and supporting evidence lanes. The controlled corpus is [PITH_FULL_IMAGE:figures/full_fig_p024_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Mode action by stage over the invalid side of the controlled corpus. Each stage also has 7 valid [PITH_FULL_IMAGE:figures/full_fig_p026_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Certificate-only comparison on the fair slice. [PITH_FULL_IMAGE:figures/full_fig_p027_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: From standards clause to operator gate. The case studies in Table 9 are instances of the same [PITH_FULL_IMAGE:figures/full_fig_p030_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Operator execution workflow in the current [PITH_FULL_IMAGE:figures/full_fig_p032_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Cross-tool behavior is evidence of divergence. Parse acceptance and policy conformance [PITH_FULL_IMAGE:figures/full_fig_p034_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Public appendix coverage by provider and artifact type. [PITH_FULL_IMAGE:figures/full_fig_p044_13.png] view at source ↗
read the original abstract

Final FIPS and PKIX standards for ML-KEM and ML-DSA settle the normative floor, yet they do not by themselves provide assurance. In practical post-quantum X.509 deployments, failures still emerge at certificate-profile semantics, SubjectPublicKeyInfo representation, and private-key container import, while current PQ public-key linting does not yet provide a reproducible workflow that says which checks belong to the certification authority, which belong to the artifact importer, and how those checks should act under deployment-facing policy. We present an operational post-quantum X.509 assurance framework for ML-KEM and ML-DSA in a narrow executable profile, pkix-core. The framework reifies 17 final-standards requirements into an assurance registry indexed by owner, stage, detector kind, normative strength, and mode-specific action; packages those requirements into three operator gate packs; spans certificate/profile, SPKI/public-key, and private-key-container/import surfaces; and evaluates them through a frozen mutation-based corpus backed by bounded public-appendix and cross-tool supporting evidence. Across a controlled corpus of 48 artifacts, comprising 21 valid and 27 invalid cases, the artifact detects all expected invalid artifacts in both strict and deployable modes with zero false positives. Strict blocks all 17 active requirements; deployable preserves the same underlying detection coverage while downgrading exactly one exercised ML-KEM canonicality condition from block to warning. On the importer-owned private-key surface, all 7 active requirements are covered, with 7/7 expected invalid detections and no open detector gaps. On a comparable certificate subset, a frozen JZLint baseline meets 5/10 expected invalid detections and fatally rejects 3 valid ML-KEM certificates, whereas the local artifact meets 10/10 with no fatal valid rejections.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces pkix-core, an operational assurance framework for post-quantum X.509 certificates using ML-KEM and ML-DSA. It reifies 17 normative requirements from final FIPS 203/204 and PKIX standards into a registry indexed by owner, stage, detector kind, normative strength, and mode-specific action. These are packaged into three gate packs spanning certificate/profile, SPKI/public-key, and private-key-container/import surfaces. The framework is evaluated on a frozen mutation-based corpus of 48 artifacts (21 valid, 27 invalid), claiming 100% detection of all expected invalids with zero false positives in both strict and deployable modes, full coverage of 7 importer-owned private-key requirements, and superior performance to a JZLint baseline (10/10 vs 5/10 invalid detections, with no fatal valid rejections).

Significance. If the corpus representativeness and registry completeness hold, the work supplies a practical, policy-configurable linting and validation tool that addresses a genuine gap in post-quantum X.509 deployment. The registry-driven separation of CA versus importer responsibilities, combined with mode-specific actions (block vs warning), offers a reproducible workflow that current tools lack. The mutation-based evaluation approach, if validated, could serve as a template for assurance of other PQ algorithms.

major comments (2)
  1. [Abstract / Evaluation] Abstract and evaluation section: The headline result (100% detection of 27 invalids, 0 false positives, 7/7 private-key coverage) is demonstrated exclusively on an author-generated frozen mutation corpus. The manuscript provides no enumeration of the mutation operators, no sampling argument against real deployment distributions, and no clause-by-clause traceability matrix linking the 17 registry entries to specific FIPS/PKIX normative text. This leaves the load-bearing assumption of representativeness unverified and prevents independent assessment of whether the observed detection rates generalize to artifacts produced by actual ML-KEM/ML-DSA implementations.
  2. [Registry definition] § on registry construction: The claim that the 17 requirements exhaustively capture every normative requirement is asserted via the registry but not supported by an explicit mapping or gap analysis against the final standards documents. Without this, it is unclear whether the framework omits any profile, SPKI, or import constraints that would appear in operational use.
minor comments (2)
  1. [Framework overview] The description of the three operator gate packs would benefit from an explicit table listing which of the 17 requirements fall into each pack and their owner/stage assignments.
  2. [Baseline comparison] Clarify the exact definition of 'fatal rejection' for the JZLint baseline comparison to ensure the 3 valid ML-KEM rejections are directly comparable to the local artifact's behavior.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review, which highlights both the potential value of the pkix-core framework and areas where additional transparency would strengthen the contribution. We address each major comment below and commit to revisions that improve verifiability without altering the core claims or evaluation results.

read point-by-point responses
  1. Referee: [Abstract / Evaluation] Abstract and evaluation section: The headline result (100% detection of 27 invalids, 0 false positives, 7/7 private-key coverage) is demonstrated exclusively on an author-generated frozen mutation corpus. The manuscript provides no enumeration of the mutation operators, no sampling argument against real deployment distributions, and no clause-by-clause traceability matrix linking the 17 registry entries to specific FIPS/PKIX normative text. This leaves the load-bearing assumption of representativeness unverified and prevents independent assessment of whether the observed detection rates generalize to artifacts produced by actual ML-KEM/ML-DSA implementations.

    Authors: We agree that the main text does not enumerate the mutation operators or include a clause-by-clause traceability matrix. These details reside in the public appendix that accompanies the frozen corpus, which documents the complete set of mutation operators and provides a mapping from each registry entry to the originating normative statements. To directly address the concern, we will add a summary table of the mutation operators and a traceability matrix to the revised evaluation section. On the sampling argument, we observe that no large-scale corpus of production ML-KEM or ML-DSA certificates yet exists, as the standards are newly finalized; the mutation-based design systematically targets the normative requirements through reproducible, bounded transformations, consistent with prior X.509 assurance studies. We will expand the manuscript to explicitly discuss this methodological choice, its scope, and its limitations relative to future real-world distributions. The reported detection rates remain supported by the cross-tool comparisons already present in the appendix. revision: yes

  2. Referee: [Registry definition] § on registry construction: The claim that the 17 requirements exhaustively capture every normative requirement is asserted via the registry but not supported by an explicit mapping or gap analysis against the final standards documents. Without this, it is unclear whether the framework omits any profile, SPKI, or import constraints that would appear in operational use.

    Authors: The 17 entries were derived by exhaustive review of the final FIPS 203, FIPS 204, and applicable PKIX documents, selecting all normative statements relevant to the certificate/profile, SPKI/public-key, and private-key-container/import surfaces within the narrow executable profile. We acknowledge that the manuscript presents the resulting registry without an accompanying explicit mapping or gap analysis in the main body. In the revision we will insert a dedicated table in the registry-construction section that maps each entry to its precise source clause or statement, together with a concise gap analysis identifying any requirements placed out of scope (for example, those applying only to non-certificate contexts or not yet enforceable by current tooling) and the rationale for exclusion. This addition will enable independent verification of scope and completeness. revision: yes

Circularity Check

0 steps flagged

No circularity; evaluation uses external corpus and baseline comparison

full rationale

The paper defines 17 requirements from final FIPS/PKIX standards, reifies them into a registry, implements checks across surfaces, and reports empirical detection results (100% on 27 invalids, 0 false positives, 10/10 vs JZLint's 5/10) on a controlled corpus of 48 artifacts plus cross-tool evidence. No step reduces a claimed prediction or result to a quantity defined in terms of the framework's own fitted parameters or self-generated inputs by construction. The corpus is presented as an evaluation vehicle rather than a definitional input, and the comparison to an existing external baseline (JZLint) supplies independent grounding. The representativeness of the mutation corpus is an external assumption about real-world coverage, not a circular reduction in the derivation itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on the final FIPS and PKIX standards documents as the source of the 17 requirements; no free parameters or new invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Final FIPS and PKIX standards for ML-KEM and ML-DSA contain 17 normative requirements that can be reified into executable checks.
    Invoked in the opening paragraph as the normative floor that the registry operationalizes.

pith-pipeline@v0.9.0 · 5878 in / 1432 out tokens · 43951 ms · 2026-05-21T09:00:12.954440+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

7 extracted references · 7 canonical work pages

  1. [1]

    libcrux Frozen Snapshot

    cryspen. libcrux Frozen Snapshot. Project-local frozen import-validation substrate; internal snapshot commit 6b9eca5a7b507e1d8423f85e5711572b4a661e8d. 2026

  2. [2]

    Public Key Linting for ML-KEM and ML-DSA

    Evangelos Karatsiolis et al. “Public Key Linting for ML-KEM and ML-DSA”. In:Applied Cryptogra- phy and Network Security Workshops. Ed. by Mark Manulis. Vol. 15654. Lecture Notes in Computer Science. Cham: Springer, 2026, pp. 337–362. doi: 10.1007/978-3-032-01806-9_18

  3. [3]

    Internet X.509 Public Key Infrastructure – Algorithm Identifiers for the Module- Lattice-Based Digital Signature Algorithm (ML-DSA)

    Jake Massimo et al. Internet X.509 Public Key Infrastructure – Algorithm Identifiers for the Module- Lattice-Based Digital Signature Algorithm (ML-DSA). RFC 9881. Oct. 2025. doi: 10.17487/RFC9881

  4. [4]

    JZLint Frozen Snapshot

    MTG AG. JZLint Frozen Snapshot. Project-local frozen certificate-level baseline; internal snapshot commit d6fdf02ad31f085e88d252b368f50e9da87debfd. 2026

  5. [5]

    Module-Lattice-Based Digital Signature Standard

    National Institute of Standards and Technology. Module-Lattice-Based Digital Signature Standard . Federal Information Processing Standards Publication 204. Aug. 2024. doi: 10.6028/NIST.FIPS. 204

  6. [6]

    Federal Information Processing Standards Publication 203

    National Institute of Standards and Technology.Module-Lattice-Based Key-Encapsulation Mech- anism Standard. Federal Information Processing Standards Publication 203. Aug. 2024. doi: 10. 6028/NIST.FIPS.203

  7. [7]

    baseline missed the expected defect

    Sean Turner et al. Internet X.509 Public Key Infrastructure - Algorithm Identifiers for the Module- Lattice-Based Key-Encapsulation Mechanism (ML-KEM) . RFC 9935. Mar. 2026. doi: 10.17487/ RFC9935. 36 A Full Requirement Catalogue The body text presents the registry as an operational object. This appendix makes the registry inspectable row by row. The cata...