pith. machine review for the scientific record.

arxiv: 2604.21393 · v1 · submitted 2026-04-23 · 💻 cs.LG

Recognition: unknown

Relocation of compact sets in $\mathbb{R}^n$ by diffeomorphisms and linear separability of datasets in $\mathbb{R}^n$

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 22:28 UTC · model grok-4.3

classification 💻 cs.LG
keywords compact sets · diffeomorphisms · linear separability · differentiable embeddings · deep neural networks · Leaky-ReLU · data classification · R^n

The pith

Finite compact sets in R^n can be relocated to arbitrary targets by diffeomorphisms and embedded into R^{n+1} to become linearly separable.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that any finite number of compact sets in n-dimensional Euclidean space can be moved to any chosen target regions through smooth invertible transformations of the entire space. This relocation result is then used to prove that the sets can always be embedded differentiably into one higher dimension so that their images are separated by hyperplanes. The theory is applied to data science by showing that compact datasets can be made linearly separable by deep neural networks of width equal to the data dimension with Leaky-ReLU, ELU, or SELU activations, provided a mild condition is met. It also shows that disjoint compact datasets can be made linearly separable in R^{n+1} by networks of width n+1.
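
The lifting step is easy to see in the simplest case. A minimal numpy sketch (our illustration, not the paper's construction): two concentric circles in R^2 are separated by no line, but the graph map x ↦ (x, exp(-|x|^2)) is a differentiable embedding of R^2 into R^3 whose images are split by the hyperplane z = 0.1.

    import numpy as np

    # Two compact sets in R^2 that no line separates: concentric circles.
    t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
    A = np.stack([np.cos(t), np.sin(t)], axis=1)        # radius 1
    B = 2.0 * np.stack([np.cos(t), np.sin(t)], axis=1)  # radius 2

    def embed(X):
        # Graph map x -> (x, exp(-|x|^2)): a differentiable embedding R^2 -> R^3.
        z = np.exp(-np.sum(X ** 2, axis=1))
        return np.column_stack([X, z])

    A3, B3 = embed(A), embed(B)

    # The images lie on opposite sides of the hyperplane z = 0.1,
    # since exp(-1) ~ 0.368 > 0.1 > exp(-4) ~ 0.018.
    print(A3[:, 2].min() > 0.1 > B3[:, 2].max())  # True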

Core claim

For any finite collection of compact sets in R^n there exist diffeomorphisms of R^n mapping each set into any prescribed target domain in R^n, and there exists a differentiable embedding of R^n into R^{n+1} such that the images of the sets are linearly separable.
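
Stated compactly, in our paraphrase (the "suitable target domain" hypothesis is left informal here; the paper's precise conditions should be consulted):

    % Paraphrase of the core claim, not the paper's verbatim theorem.
    Let $K_1,\dots,K_m \subset \mathbb{R}^n$ be pairwise disjoint compact sets and let
    $U_1,\dots,U_m \subset \mathbb{R}^n$ be suitable target domains. Then
    (i) there exist diffeomorphisms $\varphi_i \colon \mathbb{R}^n \to \mathbb{R}^n$ with
        $\varphi_i(K_i) \subset U_i$ for $i = 1,\dots,m$, and
    (ii) there exists a differentiable embedding $F \colon \mathbb{R}^n \hookrightarrow \mathbb{R}^{n+1}$
        such that $F(K_1),\dots,F(K_m)$ are pairwise separated by hyperplanes in $\mathbb{R}^{n+1}$.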

What carries the argument

Diffeomorphisms of R^n that relocate compact sets to target domains, together with differentiable embeddings into R^{n+1} that achieve linear separability of the images.

Load-bearing premise

The sets must be compact and finite in number, with suitable target domains existing for the diffeomorphisms; for the neural-network results the activations must be Leaky-ReLU, ELU, or SELU, and the datasets must satisfy a mild condition.
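
Why these three activations matter: each is strictly increasing on R, hence invertible, so a width-n layer with an invertible weight matrix is a bijection of R^n with an explicit inverse; this is the property that lets width-n networks emulate the diffeomorphisms in the theory. A small numpy sketch (our illustration; the slope a = 0.1 is an arbitrary choice, not from the paper):

    import numpy as np

    # Leaky-ReLU is strictly increasing, hence invertible; any slope a in (0, 1) works.
    def leaky_relu(x, a=0.1):
        return np.where(x >= 0.0, x, a * x)

    def leaky_relu_inv(y, a=0.1):
        return np.where(y >= 0.0, y, y / a)

    rng = np.random.default_rng(0)
    n = 3
    W = rng.standard_normal((n, n))  # generic, hence invertible with probability 1
    b = rng.standard_normal(n)

    def layer(x):
        # One width-n layer: x -> sigma(Wx + b).
        return leaky_relu(W @ x + b)

    def layer_inv(y):
        # Explicit inverse: y -> W^{-1}(sigma^{-1}(y) - b).
        return np.linalg.solve(W, leaky_relu_inv(y) - b)

    x = rng.standard_normal(n)
    print(np.allclose(layer_inv(layer(x)), x))  # True: the layer is a bijection of R^n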

What would settle it

A concrete finite collection of compact sets in R^2 that no diffeomorphism can map into two prescribed disjoint open disks, or whose images cannot be made linearly separable by any differentiable embedding into R^3.

Figures

Figures reproduced from arXiv: 2604.21393 by Qi Zhou, Xiao-Song Yang, Xuan Zhou.

Figure 1
Figure 1: Three distinct datasets in R^2. The three datasets can nevertheless be separated by three balls in R^3 after linearly embedding them into R^3. To see this, assume for convenience that the plane containing these datasets is the linear subspace R^2 × {0} (i.e., the xy-plane); then one can construct three pairwise-disjoint balls in R^3 that contain the sets A, B, and C, as illustrated in Figure 2a and …
Figure 2
Figure 2: Visualizing the separation in higher dimensions, with the explicit weight matrices W_1, …, W_6 and bias vectors b_1, …, b_6 used in the construction (for example, W_3 = diag(1, 1, −1) with b_3 = (0, 0, 10)^T).
Figure 3
Figure 3: Experimental simulation for Example 4.7. (a) The non-linearly separable topological arrangement in R^2. (b) The datasets are linearly embedded into a higher-dimensional space R^3. (c) Through successive hidden layers, the DNN progressively deforms the space, effectively lifting and bending the outer ring. (d) The final output configuration where all three datasets are strictly separable by hyperplanes …
Figure 4
Figure 4: A non-trivial Hopf link embedded in R^3. (a) The original Hopf link embedded in R^3. (b) The datasets effectively untangled and separated.
Figure 5
Figure 5: Simulation results for resolving the Hopf link topological obstruction. Left: the original Hopf link embedded in R^3. Right: the datasets effectively untangled and separated into linearly classifiable configurations after the dimension-lifting DNN transformation.
Figure 6
Figure 6: The continuous 2-dimensional Swiss Roll manifold S embedded in R^3.
read the original abstract

Relocation of compact sets in an $n$-dimensional manifold by self-diffeomorphisms is of interest in its own right and has significant potential applications to data classification in data science. This paper presents a theory for relocating a finite number of compact sets in $\mathbb{R}^n$ to arbitrary target domains in $\mathbb{R}^n$ by diffeomorphisms of $\mathbb{R}^n$. Furthermore, we prove that for any such collection, there exists a differentiable embedding into $\mathbb{R}^{n+1}$ such that their images become linearly separable. As applications of the established theory, we show that a finite number of compact datasets in $\mathbb{R}^n$ can be made linearly separable by width-$n$ deep neural networks (DNNs) with Leaky-ReLU, ELU, or SELU activation functions, under a mild condition. In addition, we show that any finite number of mutually disjoint compact datasets in $\mathbb{R}^n$ can be made linearly separable in $\mathbb{R}^{n+1}$ by a width-$(n+1)$ DNN.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper develops a theory showing that any finite collection of disjoint compact sets in R^n can be relocated to suitable target domains in R^n by diffeomorphisms of R^n. It further proves existence of a differentiable embedding into R^{n+1} rendering the images linearly separable by a smooth function taking distinct constant values on each set. Applications establish that compact datasets satisfying a mild disjointness condition can be made linearly separable by width-n DNNs using Leaky-ReLU, ELU or SELU activations, and that mutually disjoint compact datasets can be separated by width-(n+1) DNNs.

Significance. If the results hold, the work supplies a rigorous topological foundation for the linear-separability power of specific DNN architectures on compact data, connecting differential topology to machine-learning expressivity. The derivations rest on standard, parameter-free constructions from differential topology rather than ad-hoc assumptions or fitted parameters, which strengthens the claims.

minor comments (2)
  1. Abstract: the phrasing 'arbitrary target domains' should be qualified (e.g., 'suitable' or 'topologically compatible'), since the constructions explicitly require compatibility conditions; the qualification would prevent misreading even though the body already makes the restrictions clear.
  2. The transition paragraph linking the embedding result to the DNN activation claims would benefit from one additional sentence explicitly naming the mild disjointness condition and confirming that the listed activations (Leaky-ReLU, ELU, SELU) suffice to realize the required smooth separating function.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary, significance assessment, and recommendation of minor revision. The report does not list any specific major comments, so we have no point-by-point responses to provide at this time.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper's central claims are existence theorems: for any finite collection of disjoint compact sets in R^n there exist diffeomorphisms of R^n relocating them to suitable target domains, and there exists a differentiable embedding into R^{n+1} making the images linearly separable by a smooth function taking distinct constant values on each set. These rest on standard constructions in differential topology (extensions of maps, tubular neighborhoods, and smooth partitions of unity) rather than any fitted parameters, self-definitional loops, or load-bearing self-citations. The DNN applications follow immediately from the topological results once the activations (Leaky-ReLU, ELU, SELU) are known to realize the required piecewise-linear or smooth maps under the stated disjointness condition. No step reduces the claimed result to its own inputs by construction.
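
To make the "standard constructions" concrete: a textbook relocation mechanism is the time-1 flow of a compactly supported smooth vector field, which is a diffeomorphism of R^n that translates a compact set while fixing every point outside the field's support. A numerical sketch of this device (our illustration, with hypothetical displacement values; the paper's proofs rely on extension theorems such as Palais's [8] rather than this exact construction):

    import numpy as np

    def smooth_step(t):
        # C-infinity step: 0 for t <= 0, 1 for t >= 1 (classic bump-function construction).
        t = np.clip(t, 0.0, 1.0)
        a = np.where(t > 0.0, np.exp(-1.0 / np.where(t > 0.0, t, 1.0)), 0.0)
        b = np.where(t < 1.0, np.exp(-1.0 / np.where(t < 1.0, 1.0 - t, 1.0)), 0.0)
        return a / (a + b)

    V_DIR = np.array([0.5, 0.0])  # desired displacement (hypothetical example values)

    def field(X):
        # Smooth field: equals V_DIR on the unit ball, vanishes outside radius 2.
        r = np.linalg.norm(X, axis=-1, keepdims=True)
        return smooth_step(2.0 - r) * V_DIR

    def time_one_flow(X, steps=1000):
        # Forward-Euler integration; the exact flow is a diffeomorphism of R^2,
        # with inverse given by flowing the field -V for unit time.
        h = 1.0 / steps
        for _ in range(steps):
            X = X + h * field(X)
        return X

    # A compact set (circle of radius 0.3) is translated exactly by V_DIR, since its
    # whole trajectory stays inside the unit ball where the field is constant.
    s = np.linspace(0.0, 2.0 * np.pi, 50)
    K = 0.3 * np.stack([np.cos(s), np.sin(s)], axis=1)
    outside = np.array([[3.0, 0.0], [0.0, -2.5]])
    print(np.allclose(time_one_flow(K), K + V_DIR, atol=1e-6))  # True: relocated
    print(np.allclose(time_one_flow(outside), outside))         # True: fixed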

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard results from differential topology concerning the existence of diffeomorphisms and embeddings of compact sets; no free parameters, ad-hoc constants, or newly invented entities are introduced in the abstract.

axioms (2)
  • domain assumption Diffeomorphisms of R^n exist that can relocate any finite collection of compact sets to prescribed target domains
    Invoked as the core of the relocation theory in the abstract.
  • domain assumption Compact subsets of R^n admit differentiable embeddings into R^{n+1} that render their images linearly separable
    Directly stated as the proved statement.

pith-pipeline@v0.9.0 · 5504 in / 1498 out tokens · 47283 ms · 2026-05-09T22:28:38.470965+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

19 extracted references · 2 canonical work pages

  1. [1] Braga-Neto, U.: Fundamentals of Pattern Recognition and Machine Learning. Springer, Cham (2020)

  2. [2] Cohen, U., Chung, S., Lee, D. D., Sompolinsky, H.: Separability and geometry of object manifolds in deep neural networks. Nat. Commun. 11, 746 (2020)

  3. [3] Grootswagers, T., Robinson, A. K., Shatek, S. M., Carlson, T. A.: Untangling featural and conceptual object representations. NeuroImage 202, 116083 (2019)

  4. [4] Hanin, B., Sellke, M.: Approximating continuous functions by ReLU nets of minimal width. arXiv preprint arXiv:1710.11278 (2017)

  5. [5] Hwang, G.: Minimum width for deep, narrow MLP: a diffeomorphism approach. In: Advances in Neural Information Processing Systems 38 (2025)

  6. [6] Kidger, P., Lyons, T.: Universal approximation with deep narrow networks. Conference on Learning Theory, 2306–2327 (2020)

  7. [7] Lee, J. M.: Introduction to Smooth Manifolds. Second ed., Graduate Texts in Mathematics 218, Springer, New York (2013)

  8. [8] Palais, R. S.: Extending diffeomorphisms. Proceedings of the American Mathematical Society 11, 274–277 (1960)

  9. [9] Teshima, T., Ishikawa, I., Tojo, K., Oono, K., Ikeda, M., Sugiyama, M.: Coupling-based invertible neural networks are universal diffeomorphism approximators. Advances in Neural Information Processing Systems 33, 3362–3373 (2020)

  10. [17] Minimum Width of Deep Narrow Networks for Universal Approximation. arXiv preprint arXiv:2511.06837