Cryptographic Registry Provenance: Structural Defense Against Dependency Confusion in AI Package Ecosystems

Alan L. McCann

arXiv: 2605.03309 · v2 · pith:MR2ZJUVFnew · submitted 2026-05-05 · 💻 cs.CR · cs.AI · cs.SE


Pith reviewed 2026-05-07 15:49 UTC · model grok-4.3

classification 💻 cs.CR · cs.AI · cs.SE

keywords dependency confusion · cryptographic provenance · registry identity · Ed25519 signatures · dual-signature model · namespace binding · AI package ecosystems · supply chain security


The pith

A cryptographic system with registry Ed25519 identities, dual publisher-registry signatures, and consumer-pinned fingerprints creates three independent layers that must all be breached for a dependency confusion attack to succeed.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes replacing configuration-based defenses against dependency confusion with a structural cryptographic provenance system. Every registry holds an Ed25519 keypair and signs every artifact it distributes; publishers sign packages at packaging time and registries countersign at publication; and consumers pin registry fingerprints in their resolvers so that artifacts from unauthorized registries are rejected. These three mechanisms operate independently, so an attacker must compromise registry keys, publisher signing, and consumer resolver enforcement at the same time. The paper reports that none of the eight ecosystems it compares currently combines all four properties it requires (mandatory publisher signing, cryptographic registry identity, mandatory registry countersigning, and consumer-side cryptographic enforcement). It also extends the design to AI-generation provenance and presents a runtime governance case study.
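The consumer-side pinning step described above can be illustrated with a minimal sketch. It is not taken from the paper: the fingerprint format (hex SHA-256 of the raw public key), the `namespace/name` package naming, and the function names `fingerprint` and `is_authorized` are all assumptions made for illustration. The sketch covers only the pin comparison; verifying the artifact signature under the matched key would be a separate step.

```python
import hashlib
import hmac


def fingerprint(public_key: bytes) -> str:
    """Assumed fingerprint format: hex SHA-256 of the raw registry public key."""
    return hashlib.sha256(public_key).hexdigest()


def is_authorized(package_name: str, registry_public_key: bytes,
                  pins: dict[str, str]) -> bool:
    """Return True only if the package's namespace is pinned to this registry key.

    `pins` maps a namespace (hypothetical convention: the text before the
    first "/") to the fingerprint of the one registry allowed to serve it.
    """
    namespace = package_name.split("/", 1)[0]
    pinned = pins.get(namespace)
    if pinned is None:
        # Fail closed: an unpinned namespace is rejected, not silently allowed.
        return False
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(pinned, fingerprint(registry_public_key))
```

With a pin for `@acme` set to the internal registry's fingerprint, a same-named package offered under any other registry key is rejected, which is the behavior the namespace-binding layer is meant to provide.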

Core claim

The central claim is that cryptographic distribution provenance, built from mandatory registry keypairs, dual signatures, and authoritative namespace binding enforced at the resolver, closes the structural gap that allows dependency confusion attacks to succeed without detection once a package is installed.

What carries the argument

The cryptographic distribution provenance system, which binds each artifact to its registry through Ed25519 signatures and consumer-enforced fingerprint pins.

If this is right

Any successful attack requires simultaneous compromise of registry signing keys, publisher signatures, and consumer resolver configuration.
Configuration errors no longer produce silent failures because enforcement moves into the cryptographic resolver.
The same signed attributes can record AI-generation provenance and feed into governance-enforced resolution.
A four-phase lifecycle chain with no cryptographic gaps becomes possible when the provenance system is integrated with runtime governance.
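As a rough illustration of how signed attributes could feed governance-enforced resolution, the sketch below serializes attributes deterministically (so a signer and a verifier would hash identical bytes) and applies a simple policy. The attribute names `ai_generated` and `human_reviewed`, the policy keys, and the choice to reject unlabeled packages are hypothetical and are not specified in the source; the sketch also assumes the signature over the attributes has already been verified.

```python
import json


def canonical_attributes(attrs: dict) -> bytes:
    """Deterministic JSON encoding: sorted keys, no insignificant whitespace."""
    return json.dumps(attrs, sort_keys=True, separators=(",", ":")).encode("utf-8")


def resolution_allowed(attrs: dict, policy: dict) -> bool:
    """Apply a governance policy to attributes whose signature was already checked."""
    if "ai_generated" not in attrs:
        # Assumed design choice: packages with no provenance label fail closed.
        return False
    if not attrs["ai_generated"]:
        return True
    if not policy.get("allow_ai_generated", False):
        return False
    if policy.get("require_human_review", False):
        return bool(attrs.get("human_reviewed", False))
    return True
```

Because the attributes would sit inside the signed payload, a policy like this relies on values that cannot be altered after signing without invalidating the signature.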

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Package managers in all eight compared ecosystems would need code changes to perform the cryptographic checks at resolution time.
The approach could extend beyond dependency confusion to other supply-chain attacks that rely on registry impersonation.
Adoption would shift trust from individual publishers and registries toward verifiable key management and fingerprint distribution.

Load-bearing premise

Registries will universally adopt and maintain Ed25519 keypairs, publishers will sign every package at packaging time, and consumers will correctly configure resolvers to pin and enforce registry fingerprints.

What would settle it

A dependency confusion attack that succeeds on a fully deployed instance of the system by compromising only one or two of the three layers rather than all three simultaneously.

Figures

Figures reproduced from arXiv: 2605.03309 by Alan L. McCann.

**Figure 1.** Two-layer archive format with dual-signature flow. The uncompressed outer tar contains …

**Figure 2.** Three-layer cryptographic defense against dependency confusion. Each layer operates …

Original abstract

Dependency confusion attacks exploit a structural gap in software distribution: once a package is installed, there is no cryptographic proof of which registry distributed it. Every existing defense is configuration-based and fails silently when misconfigured. We present a cryptographic distribution provenance system comprising three components: (1) cryptographic registry identity, where every registry holds an Ed25519 keypair and signs every artifact it distributes; (2) a dual-signature model, where the publisher signs at packaging time and the registry countersigns at publication time; and (3) authoritative namespace binding, where consumers pin registry fingerprints and the resolver cryptographically rejects artifacts from unauthorized registries. These create three defense layers requiring simultaneous compromise for a successful attack. A comparison across eight ecosystems (npm, Cargo, Hex.pm, PyPI, Go modules, Docker/OCI, NuGet, Maven) shows no existing ecosystem combines mandatory publisher signing, cryptographic registry identity, mandatory registry countersigning, and consumer-side cryptographic enforcement. The system extends to AI-generation provenance as a signed attribute and governance-enforced dependency resolution. A case study integrates distribution provenance with a three-layer runtime governance architecture, creating a four-phase lifecycle chain with no cryptographic gaps.
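The dual-signature flow in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's wire format: the message layouts, the domain-separation prefixes, and all function names are hypothetical. Signing and verification are injected as callables so that a real Ed25519 implementation can be plugged in. Because the Python standard library has no Ed25519, the `toy_scheme` helper substitutes HMAC-SHA256, a symmetric stand-in that is not a public-key signature and only serves to make the sketch runnable.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable

Signer = Callable[[bytes], bytes]
Verifier = Callable[[bytes, bytes], bool]  # (message, signature) -> valid?


@dataclass(frozen=True)
class SignedArtifact:
    payload: bytes
    publisher_sig: bytes
    registry_sig: bytes


def publisher_message(payload: bytes) -> bytes:
    # Hypothetical layout: domain prefix plus the payload digest.
    return b"publisher-v1\x00" + hashlib.sha256(payload).digest()


def registry_message(payload: bytes, publisher_sig: bytes) -> bytes:
    # The countersignature covers the payload AND the publisher signature,
    # so it cannot be transplanted onto a different publisher's package.
    return (b"registry-v1\x00" + hashlib.sha256(payload).digest()
            + hashlib.sha256(publisher_sig).digest())


def package(payload: bytes, publisher_sign: Signer) -> bytes:
    """Packaging time: the publisher signs the artifact."""
    return publisher_sign(publisher_message(payload))


def publish(payload: bytes, publisher_sig: bytes,
            publisher_verify: Verifier, registry_sign: Signer) -> SignedArtifact:
    """Publication time: the registry checks the publisher, then countersigns."""
    if not publisher_verify(publisher_message(payload), publisher_sig):
        raise ValueError("publisher signature invalid; refusing to countersign")
    registry_sig = registry_sign(registry_message(payload, publisher_sig))
    return SignedArtifact(payload, publisher_sig, registry_sig)


def verify_artifact(artifact: SignedArtifact, publisher_verify: Verifier,
                    registry_verify: Verifier) -> bool:
    """Consumer side: both signatures must verify."""
    return (publisher_verify(publisher_message(artifact.payload), artifact.publisher_sig)
            and registry_verify(registry_message(artifact.payload, artifact.publisher_sig),
                                artifact.registry_sig))


def toy_scheme(secret: bytes) -> tuple[Signer, Verifier]:
    """Stand-in for Ed25519 (NOT secure as a signature scheme): HMAC-SHA256."""
    def sign(message: bytes) -> bytes:
        return hmac.new(secret, message, hashlib.sha256).digest()

    def verify(message: bytes, signature: bytes) -> bool:
        return hmac.compare_digest(sign(message), signature)

    return sign, verify
```

An artifact countersigned by one registry fails `verify_artifact` when checked against a different registry's verifier, which is the property that lets a consumer tell which registry distributed a package.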

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a straightforward design proposal for three-layer cryptographic provenance in package ecosystems that fills an identified gap but stays conceptual with no proofs or code.


The main takeaway is a proposed system that uses Ed25519 registry keys, dual publisher-plus-registry signatures, and consumer-pinned namespace binding so that a dependency confusion attack requires breaking all three at once. The survey across npm, Cargo, PyPI, Go, Docker, NuGet, Maven, and Hex.pm is the clearest contribution: it shows that none currently combines mandatory publisher signing, cryptographic registry identity, registry countersigning, and resolver enforcement. That comparison is direct and useful for anyone mapping the current state of supply-chain controls. The architecture itself is simple, reuses established primitives, and extends sensibly to signed AI-generation attributes plus governance rules. It frames the problem as structural rather than as another optional configuration setting, which is a fair point.

The soft spots are exactly what the abstract signals: no formal threat model, no security reduction, no prototype, and no attack simulations. Everything rests on the assumption that registries will run and protect the keys, publishers will sign at build time, and consumers will actually pin fingerprints in their resolvers. Partial adoption would leave the layers ineffective, and the paper does not explore migration paths or partial-deployment scenarios.

This is for people working on package-manager security standards or AI supply-chain tooling who need concrete ideas to discuss. It shows clear, coherent thinking about the existing literature on dependency attacks. I would send it to peer review so the authors can get feedback on tightening the security argument and adding at least a reference implementation.

Referee Report

1 major / 2 minor

Summary. The paper proposes a cryptographic provenance system for package registries to address dependency confusion attacks. It defines three components: (1) cryptographic registry identity via Ed25519 keypairs where registries sign all distributed artifacts, (2) a dual-signature model with publisher signatures at packaging time and registry countersignatures at publication, and (3) authoritative namespace binding where consumers pin registry fingerprints and resolvers enforce cryptographic rejection of unauthorized sources. These are claimed to form three defense layers requiring simultaneous compromise for attack success. A comparison across eight ecosystems (npm, Cargo, Hex.pm, PyPI, Go modules, Docker/OCI, NuGet, Maven) finds none combine mandatory publisher signing, cryptographic registry identity, mandatory registry countersigning, and consumer-side cryptographic enforcement. The work extends the approach to AI-generation provenance as a signed attribute and presents a case study integrating it with runtime governance for a four-phase lifecycle.

Significance. If the design holds under adoption, it would shift dependency confusion defenses from fragile configuration to structural cryptographic requirements, forcing attackers to compromise independent parties and keys simultaneously. The ecosystem comparison usefully identifies a concrete gap in current practice. The AI-provenance extension is timely. Practical impact, however, depends on registries, publishers, and consumers all enforcing the layers, an aspect not modeled or simulated in the manuscript.

major comments (1)

Abstract: The central claim that the three components 'create three defense layers requiring simultaneous compromise for a successful attack' is asserted without an accompanying threat model, attack-surface analysis, or even informal argument showing why bypassing one layer leaves the others intact. A dedicated section (e.g., §4 or §5) formalizing the adversary model and demonstrating the simultaneous-compromise property is load-bearing for the security guarantee.

minor comments (2)

The comparison across eight ecosystems is described but not tabulated; adding a summary table with columns for each of the four properties (mandatory publisher signing, cryptographic registry identity, mandatory registry countersigning, consumer-side enforcement) would make the uniqueness claim easier to verify and reproduce.
The case study integrating distribution provenance with the three-layer runtime governance architecture is mentioned only at high level; expanding it with concrete integration points or a diagram of the four-phase lifecycle would improve clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback and recommendation for major revision. We address the single major comment below and will incorporate the suggested changes.


Referee: [—] Abstract: The central claim that the three components 'create three defense layers requiring simultaneous compromise for a successful attack' is asserted without an accompanying threat model, attack-surface analysis, or even informal argument showing why bypassing one layer leaves the others intact. A dedicated section (e.g., §4 or §5) formalizing the adversary model and demonstrating the simultaneous-compromise property is load-bearing for the security guarantee.

Authors: We agree that the central claim requires explicit support to be fully substantiated. The manuscript currently describes the three components and their roles but does not include a dedicated adversary model or argument. In the revised manuscript we will insert a new Section 4 ('Adversary Model and Security Argument') immediately after the system description. This section will define the adversary (an attacker capable of compromising individual signing keys or registry identities but not all simultaneously), enumerate the attack surface for dependency confusion, and supply an informal but structured argument showing that each layer enforces an independent cryptographic check. Consequently, bypassing one layer (for example, by forging a publisher signature) still leaves the registry countersignature and consumer-side namespace binding intact, so that a successful attack requires simultaneous compromise of all three.

revision: yes
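The simultaneous-compromise property that the promised section is meant to establish can be stated as a small enumeration. The sketch below does not prove anything: it encodes the claimed independence of the layers as an assumption (a forged artifact passes a layer's check only if that layer is compromised) and then lists which compromise sets let an attack through. The layer names are hypothetical labels.

```python
from itertools import product

# Hypothetical labels for the three layers named in the paper.
LAYERS = ("registry_key", "publisher_key", "resolver_pin")


def attack_succeeds(compromised: frozenset) -> bool:
    """Assumed model: every layer runs its own check, and a forged artifact
    passes a check only when the attacker controls that layer."""
    return all(layer in compromised for layer in LAYERS)


def successful_compromise_sets() -> list:
    """Enumerate all 2**3 compromise subsets and keep those that succeed."""
    results = []
    for flags in product((False, True), repeat=len(LAYERS)):
        compromised = frozenset(
            layer for layer, is_compromised in zip(LAYERS, flags) if is_compromised
        )
        if attack_succeeds(compromised):
            results.append(compromised)
    return results
```

Under this model only the full set of three layers succeeds. A security argument would need to justify the independence assumption itself, for example by showing that no single key or configuration value satisfies more than one check.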

Circularity Check

0 steps flagged

No circularity: proposal is self-contained architectural design


The manuscript proposes a three-component cryptographic provenance system and asserts that its layers require simultaneous compromise, supported by an external comparison of eight existing ecosystems. No equations, fitted parameters, predictions, or first-principles derivations appear in the provided text; the defense-layer claim follows directly from the enumerated components (Ed25519 registry identity, dual signatures, namespace pinning) by construction of the proposal itself rather than by reduction to prior inputs. The ecosystem comparison is presented as an observational survey without internal fitting or self-referential justification. No self-citations, uniqueness theorems, or ansatzes are invoked to bear load on the central claims. The design is therefore self-contained against external benchmarks and exhibits no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The proposal rests on standard cryptographic assumptions for Ed25519 and on domain assumptions about ecosystem adoption; no free parameters or invented entities are introduced beyond the described architecture.

axioms (2)

standard math: Ed25519 signatures are unforgeable under standard cryptographic assumptions. Invoked for registry identity and dual signatures.
domain assumption: Registries, publishers, and consumers will correctly implement and maintain the signing and pinning mechanisms. Required for the three-layer defense to function as stated.

