MimosaNet: An Unrobust Neural Network Preventing Model Stealing

András Horváth; Jalal Al-Afandi; Kálmán Szentannai

arxiv: 1907.01650 · v1 · pith:KQRURQYWnew · submitted 2019-07-02 · 💻 cs.LG · cs.CR · stat.ML

Pith reviewed 2026-05-25 10:47 UTC · model grok-4.3

classification 💻 cs.LG · cs.CR · stat.ML

keywords: model stealing, neural network sensitivity, parameter perturbations, intellectual property, deep neural networks, fully connected networks, network transformation

The pith

A trained neural network can be transformed to keep its accuracy while becoming unusable after any weight modification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Deep neural networks usually remain accurate despite small changes to their weights. This robustness lets attackers steal a model, tweak it slightly, and claim it as their own. The paper introduces a transformation for fully connected networks that keeps the exact same responses and accuracy. The resulting network, however, loses performance from even the smallest weight perturbations. This is designed to stop model stealing by making copied versions impractical to use or modify.

Core claim

The central discovery is a method to generate an equivalent fully connected deep neural network that produces identical classification outputs and accuracy to the original but exhibits extreme sensitivity to any changes in its weights, thereby preventing unauthorized modifications for model stealing purposes.

What carries the argument

The MimosaNet transformation applied to a trained fully connected deep neural network, which preserves the input-output mapping and classification accuracy while making the network extremely sensitive to weight perturbations.
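To make the idea of "same mapping, fragile weights" concrete, here is a minimal sketch under stated assumptions. It is not the paper's construction, which this page does not describe. The helper `add_cancelling_pair` is hypothetical: it appends two identical hidden neurons whose outgoing weights are `+scale` and `-scale`, so their contributions cancel (up to floating-point rounding) until any weight is nudged. The network sizes, `scale`, and noise level are illustrative choices.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def affine(W, b, x):
    # W is a list of rows; returns W @ x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(net, x):
    # One hidden ReLU layer followed by a linear output layer.
    W1, b1, W2, b2 = net
    return affine(W2, b2, relu(affine(W1, b1, x)))

def random_net(n_in, n_hidden, n_out, rng):
    def mat(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]
    return (mat(n_hidden, n_in), [rng.uniform(-1, 1) for _ in range(n_hidden)],
            mat(n_out, n_hidden), [rng.uniform(-1, 1) for _ in range(n_out)])

def add_cancelling_pair(net, scale, rng):
    # Hypothetical transformation (an assumption, not the paper's method):
    # two identical hidden neurons with outgoing weights +scale and -scale.
    W1, b1, W2, b2 = net
    w_in = [rng.uniform(-1, 1) for _ in W1[0]]
    bias = float(len(w_in)) + 1.0  # keeps both neurons active for inputs in [-1, 1]
    return ([r[:] for r in W1] + [w_in[:], w_in[:]], b1 + [bias, bias],
            [r + [scale, -scale] for r in W2], b2[:])

def perturb(net, eps, rng):
    # Add independent uniform noise in [-eps, eps] to every weight and bias.
    def jitter(v):
        return v + rng.uniform(-eps, eps)
    W1, b1, W2, b2 = net
    return ([[jitter(w) for w in r] for r in W1], [jitter(b) for b in b1],
            [[jitter(w) for w in r] for r in W2], [jitter(b) for b in b2])

def max_deviation(net_a, net_b, inputs):
    # Largest absolute output difference between two networks over the inputs.
    return max(abs(ya - yb) for x in inputs
               for ya, yb in zip(forward(net_a, x), forward(net_b, x)))

rng = random.Random(0)
original = random_net(2, 3, 2, rng)
fragile = add_cancelling_pair(original, scale=1e4, rng=rng)
inputs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(20)]

equivalence_gap = max_deviation(original, fragile, inputs)
drift_original = max_deviation(original, perturb(original, 1e-3, rng), inputs)
drift_fragile = max_deviation(fragile, perturb(fragile, 1e-3, rng), inputs)
print(equivalence_gap, drift_original, drift_fragile)
```

In this toy setting the transformed network matches the original on every sampled input, while the same noise level moves its outputs far more than it moves the original's. A pair like this would be easy for an attacker to spot and prune, so the sketch only shows that equivalence and fragility can coexist; it says nothing about how hard the real construction is to undo.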

If this is right

Networks can be shared publicly without easy theft and rebranding by attackers.
Stolen copies become non-functional after any attempt to modify the weights.
The method applies to any already trained fully connected deep neural network.
It addresses barriers to free distribution of networks in embedded systems due to IP concerns.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same sensitivity principle might be tested on architectures beyond fully connected layers if the underlying construction allows it.
Adoption could influence how model updates or fine-tuning are handled in shared environments.
It raises the possibility of designing licensing models that rely on fragility to unauthorized edits.

Load-bearing premise

It is possible to construct a network with identical input-output behavior and accuracy yet with extreme sensitivity to any weight perturbation without introducing other performance or stability issues.
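One way to state this premise precisely is sketched below. The notation is assumed here and does not come from the paper: $f_\theta$ is the trained network, $\theta'$ the transformed parameters, $\mathrm{Acc}$ the accuracy on held-out data, and $\varepsilon$ a small perturbation budget.

```latex
% Assumed notation (not from the paper): f_\theta is the trained network,
% \theta' the transformed parameters, \mathrm{Acc} accuracy on held-out data.
\begin{align*}
  &\text{Equivalence:} && f_{\theta'}(x) = f_{\theta}(x) \quad \text{for all inputs } x, \\
  &\text{Fragility:}   && \mathrm{Acc}\bigl(f_{\theta' + \delta}\bigr) \ll \mathrm{Acc}\bigl(f_{\theta'}\bigr)
      \quad \text{for every perturbation } 0 < \lVert \delta \rVert \le \varepsilon .
\end{align*}
```

This is the strong reading of "any weight perturbation". A weaker reading would require fragility only for typical perturbations, such as random noise; the abstract does not say which is intended.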

What would settle it

A demonstration that some small weight perturbation preserves the network's classification accuracy on held-out test data would show the claimed sensitivity does not hold.
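The test described above can be sketched as a search for a perturbation that survives. This is a minimal illustration, not a protocol from the paper: the two-class linear scorer, the synthetic held-out set, and the function `find_surviving_perturbation` with its parameters `sigma`, `trials`, and `tolerance` are all hypothetical stand-ins for a real network and test set.

```python
import random

def predict(weights, x):
    # Linear two-class scorer: weights holds one (w0, w1, bias) row per class.
    scores = [w0 * x[0] + w1 * x[1] + b for w0, w1, b in weights]
    return scores.index(max(scores))

def accuracy(weights, data):
    return sum(predict(weights, x) == y for x, y in data) / len(data)

def find_surviving_perturbation(weights, data, sigma, trials, tolerance, rng):
    # Return (perturbed weights, accuracy) for the first Gaussian perturbation
    # that keeps held-out accuracy within `tolerance` of the baseline, or None.
    baseline = accuracy(weights, data)
    for _ in range(trials):
        candidate = [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
        acc = accuracy(candidate, data)
        if acc >= baseline - tolerance:
            return candidate, acc
    return None

rng = random.Random(1)
# Toy held-out set: label 1 when x0 > x1, with a margin so the task is easy.
held_out = []
while len(held_out) < 200:
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    if abs(x[0] - x[1]) > 0.2:
        held_out.append((x, int(x[0] > x[1])))

# Class 0 scores x1 - x0; class 1 scores x0 - x1.
robust_weights = [[-1.0, 1.0, 0.0], [1.0, -1.0, 0.0]]
result = find_surviving_perturbation(robust_weights, held_out, sigma=0.01,
                                     trials=50, tolerance=0.0, rng=rng)
print("counterexample found:", result is not None)
```

For an ordinary, robust classifier like this toy one, the search succeeds almost immediately. If the paper's claim holds, the same search run against a transformed network should come back empty for the perturbation sizes an attacker would plausibly use.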

Original abstract

Deep Neural Networks are robust to minor perturbations of the learned network parameters and their minor modifications do not change the overall network response significantly. This allows space for model stealing, where a malevolent attacker can steal an already trained network, modify the weights and claim the new network his own intellectual property. In certain cases this can prevent the free distribution and application of networks in the embedded domain. In this paper, we propose a method for creating an equivalent version of an already trained fully connected deep neural network that can prevent network stealing: namely, it produces the same responses and classification accuracy, but it is extremely sensitive to weight changes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes MimosaNet, a construction for fully connected deep neural networks that yields an equivalent network with identical input-output behavior and classification accuracy to a trained original, yet with extreme sensitivity to any weight perturbations, intended to deter model stealing by rendering modifications ineffective or detectable.

Significance. If a reliable method existed to isolate extreme weight sensitivity while preserving exact functional equivalence, it could have practical value for IP protection in embedded deployments. The abstract, however, contains no derivation, algorithm, or empirical evidence, so the significance of any such result cannot be assessed from the provided material.

major comments (1)

[Abstract] The central claim that an equivalent network can be constructed with identical responses and accuracy yet 'extremely sensitive to weight changes' is stated without any supporting derivation, algorithm, or experimental result, rendering the claim unevaluable.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their comments. The abstract is a high-level summary; the full manuscript contains the requested details on the construction. We address the point below and will revise the abstract for clarity.

Point-by-point responses

Referee: [Abstract] The central claim that an equivalent network can be constructed with identical responses and accuracy yet 'extremely sensitive to weight changes' is stated without any supporting derivation, algorithm, or experimental result, rendering the claim unevaluable.

Authors: The abstract summarizes the contribution without derivations or results, as is standard. The full paper details the MimosaNet construction (a method to produce a functionally equivalent network via targeted weight adjustments that preserve input-output mapping and accuracy while inducing extreme sensitivity to further perturbations), including the algorithm, mathematical justification for equivalence and sensitivity, and experiments on fully connected networks. We agree the abstract could better signal the approach and will revise it to include a brief description of the key technique.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The paper proposes a construction for an equivalent fully-connected network with identical input-output mapping and accuracy but extreme sensitivity to weight perturbations. No equations, derivations, predictions, or self-citations appear in the abstract or context that reduce any claimed result to its own inputs by construction. The existence claim is consistent with known overparameterization in neural networks and does not invoke uniqueness theorems, fitted parameters renamed as predictions, or ansatzes smuggled via citation. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no free parameters, axioms, or invented entities; ledger is empty by necessity.

pith-pipeline@v0.9.0 · 5643 in / 810 out tokens · 24392 ms · 2026-05-25T10:47:47.076252+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation to the cited theorem: unclear
Paper passage: We propose a method for creating an equivalent version of an already trained fully connected deep neural network that can prevent network stealing: namely, it produces the same responses and classification accuracy, but it is extremely sensitive to weight changes.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation to the cited theorem: unclear
Paper passage: Decomposing Neurons... non-homogeneous linear equation system for each output neuron... K ≥ N + 1

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.