Mixup Barcodes: Quantifying Geometric-Topological Interactions between Point Clouds

Hubert Wagner; Matthew Wheeler; Nickolas Arustamyan; Peter Bubenik

arxiv: 2402.15058 · v3 · pith:STQGU7MJnew · submitted 2024-02-23 · 🧮 math.AT · cs.CG · cs.LG


Pith reviewed 2026-05-24 03:46 UTC · model grok-4.3


keywords: mixup barcodes · persistent homology · image persistent homology · point clouds · geometric-topological interactions · topological data analysis · embeddings · disentanglement


The pith

Mixup barcodes combine standard and image persistent homology to measure geometric-topological interactions between point clouds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a mixup barcode that tracks how the geometric placement of points in two sets influences the topological features that appear when the sets interact. Simple summary statistics derived from these barcodes reduce the interactions to single numbers that quantify their overall complexity. The authors test the approach on embeddings from machine learning models to assess how well different classes remain separated. Unlike ordinary persistent homology, the construction remains sensitive to the actual locations of the topological features. A software implementation is supplied so users can compute the quantities on their own data.

Core claim

By merging standard persistent homology with image persistent homology, the authors obtain a mixup barcode whose features record geometric-topological interactions between two arbitrary point sets. From this barcode they extract two scalars, total mixup and total percentage mixup, that serve as single-number measures of interaction complexity. When applied to class embeddings, the resulting values indicate the degree of disentanglement and demonstrate sensitivity to the spatial positions of topological features.
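To make the two scalars concrete, here is a minimal Python sketch. This page does not spell out the definitions, so the sketch rests on labeled assumptions: each bar of the mixup barcode is taken to be a triple (birth, death, image death) with birth <= image death <= death, the mixup of a bar is death minus image death, and the percentage is reported as a fraction of total persistence. The function name `mixup_summaries` is hypothetical and is not the API of the authors' software.

```python
def mixup_summaries(bars):
    """Summarize a list of (birth, death, image_death) triples.

    ASSUMPTION (not stated on this page): a bar's mixup is death - image_death,
    total mixup is the sum of these, and total percentage mixup is that sum
    divided by the total persistence sum(death - birth).
    Only finite bars are expected.
    """
    total_mixup = 0.0
    total_persistence = 0.0
    for birth, death, image_death in bars:
        if not (birth <= image_death <= death):
            raise ValueError("expected birth <= image_death <= death")
        total_mixup += death - image_death
        total_persistence += death - birth
    # Guard against an empty or zero-persistence barcode.
    fraction = total_mixup / total_persistence if total_persistence > 0 else 0.0
    return total_mixup, fraction


# One bar is cut short in the image (dies at 2 instead of 4), one is unaffected.
example_bars = [(0.0, 4.0, 2.0), (0.0, 2.0, 2.0)]
total, fraction = mixup_summaries(example_bars)
```

Under these assumptions a barcode with no interaction gives zero for both values, and the fraction lies between 0 and 1.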

What carries the argument

The mixup barcode obtained by combining standard persistent homology with image persistent homology on a pair of point clouds.

If this is right

The summary statistics supply a single scalar that ranks the complexity of interactions between any two point sets in any dimension.
The method can be used to quantify how well classes remain disentangled inside learned embeddings.
Because the construction is sensitive to feature locations, it can distinguish configurations that ordinary persistent homology treats as identical.
The provided software allows direct computation of these quantities on new data sets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same construction could be applied to pairs of shapes represented by meshes or density functions instead of raw point clouds.
Total mixup values might serve as an auxiliary loss term when training models that are required to preserve or suppress certain geometric-topological relations.
Systematic comparison of mixup barcodes against other two-set topological invariants could identify data regimes where location sensitivity is most useful.

Load-bearing premise

The specific pairing of standard and image persistent homology yields summary statistics that track genuine geometric-topological interactions rather than artifacts of the filtrations chosen.

What would settle it

Compute the total mixup statistic on two point clouds, then rigidly translate one cloud so that its topological features move relative to the other while preserving all individual homological features. If the statistic stays exactly the same, the claim that it registers geometric interactions is refuted.
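The translation test can be tried by hand in homological dimension 0. The sketch below is not the authors' software and covers only connected components. It assumes a Vietoris-Rips-style filtration in which an edge enters at its Euclidean length, and it assumes that total mixup is the sum of the deaths of the components of A computed in A alone, minus the sum of their deaths in the image of the map induced by the inclusion of A into A ∪ B. An image class dies when two components of A ∪ B that both contain points of A merge. The names `merge_times` and `total_mixup_h0` are hypothetical.

```python
import math
from itertools import combinations


def _find(parent, i):
    # Union-find root lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def merge_times(points, is_a):
    """Kruskal-style sweep over all pairwise distances.

    Returns the scales at which two components that BOTH contain a point
    flagged in is_a are merged. With every point flagged, these are the
    finite H0 death times of the point set itself.
    """
    n = len(points)
    parent = list(range(n))
    has_a = list(is_a)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    times = []
    for length, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri == rj:
            continue
        if has_a[ri] and has_a[rj]:
            times.append(length)
        parent[ri] = rj
        has_a[rj] = has_a[rj] or has_a[ri]
    return times


def total_mixup_h0(a, b):
    # ASSUMPTION: total mixup = sum of deaths in A minus sum of image deaths.
    deaths = merge_times(a, [True] * len(a))
    image_deaths = merge_times(a + b, [True] * len(a) + [False] * len(b))
    return sum(deaths) - sum(image_deaths)


a = [(0.0, 0.0), (4.0, 0.0)]
b_between = [(2.0, 0.0)]        # B sits between the two points of A
b_far = [(2.0, 100.0)]          # the same B, rigidly translated far away

mixup_between = total_mixup_h0(a, b_between)
mixup_far = total_mixup_h0(a, b_far)
```

In this toy configuration the point of B bridges the two points of A at scale 2 instead of 4, so the statistic is positive when B lies between them and zero once B is translated away, even though each cloud's own barcode is unchanged. This illustrates the behaviour the test looks for in dimension 0 under the stated assumptions; it says nothing about higher dimensions.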

Original abstract

We combine standard persistent homology with image persistent homology to define a novel way of characterizing shapes and interactions between them. In particular, we introduce: (1) a mixup barcode, which captures geometric-topological interactions (mixup) between two point sets in arbitrary dimension; (2) simple summary statistics, total mixup and total percentage mixup, which quantify the complexity of the interactions as a single number; (3) a software tool for playing with the above. As a proof of concept, we apply this tool to a problem arising from machine learning. In particular, we study the disentanglement in embeddings of different classes. The results suggest that topological mixup is a useful method for characterizing interactions for low and high-dimensional data. Compared to the typical usage of persistent homology, the new tool is sensitive to the geometric locations of the topological features, which is often desirable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper defines mixup barcodes by combining ordinary persistent homology with image persistent homology, plus two scalar summaries and a software tool, then tests the idea on embedding disentanglement.


The core new object is the mixup barcode, which records how two point clouds interact when their filtrations are combined in a specific way. The authors also introduce total mixup and total percentage mixup as single-number summaries of that interaction, and they supply code for computing them. That combination is not a direct reduction of anything already in the literature, so the construction itself is the main contribution.

The proof-of-concept applies the summaries to check disentanglement across classes in learned embeddings, and the claim is that the new quantities pick up geometric placement of topological features in a way standard barcodes do not. The software tool is a practical plus for anyone who wants to experiment with the idea. The application stays at the level of suggestion rather than a controlled benchmark with error bars or direct comparison against other interaction measures, so the strength of the evidence is still modest.

The central definitions appear to be well-posed on their own terms, with no obvious circularity or free parameters that would make the outputs tautological. Because the work is mainly a methodological proposal with an illustrative example, it is aimed at readers already working in topological data analysis who need to compare or embed point clouds in moderate dimensions. A serious referee could usefully check the formal properties of the mixup construction, the stability of the summaries, and whether the ML example adds convincing support. I would send it to review rather than desk-reject.

Referee Report

2 major / 1 minor

Summary. The manuscript defines the mixup barcode by combining standard persistent homology with image persistent homology to characterize geometric-topological interactions between two point clouds in arbitrary dimension. It introduces scalar summaries (total mixup and total percentage mixup), supplies a software tool, and applies the construction as a proof of concept to the problem of measuring disentanglement in machine-learning embeddings of different classes. The results are presented as suggesting that the new summaries are sensitive to the geometric locations of topological features and therefore useful for both low- and high-dimensional data.

Significance. If the mixup summaries can be shown to capture genuine interactions rather than construction artifacts, the method would extend the toolkit of topological data analysis by making geometric position explicit, which is frequently desirable when analyzing point clouds or embeddings. The provision of an accompanying software tool is a concrete strength that supports reproducibility and further experimentation.

major comments (2)

[Application section] The proof-of-concept reports only qualitative suggestions of utility; no quantitative metrics, statistical comparisons against standard persistent homology on the same point clouds, or error analysis are supplied, leaving the central claim that the tool is 'useful' without load-bearing empirical support.
[Definition of the mixup barcode] The construction combines two filtrations whose interaction is summarized by new scalar statistics, yet no invariance, stability, or non-degeneracy result is stated that would guarantee the summaries reflect geometric-topological content rather than filtration artifacts; this assumption is load-bearing for any claim of meaningful quantification.

minor comments (1)

[Abstract] The description of the new objects would be clearer if at least one key formula or diagram were included so that readers can assess the construction without immediately consulting the body text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below.


Referee: [Application section] The proof-of-concept reports only qualitative suggestions of utility; no quantitative metrics, statistical comparisons against standard persistent homology on the same point clouds, or error analysis are supplied, leaving the central claim that the tool is 'useful' without load-bearing empirical support.

Authors: The manuscript is presented explicitly as a proof of concept introducing the mixup barcode and its scalar summaries. The application to class disentanglement in embeddings is intended to illustrate the method's sensitivity to geometric locations of topological features, which distinguishes it from standard persistent homology. While we acknowledge that quantitative metrics, statistical comparisons, and error analysis would provide additional empirical support, such elements are beyond the scope of this introductory work. The qualitative demonstration suffices to suggest utility for both low- and high-dimensional data.
Revision: no

Referee: [Definition of the mixup barcode] The construction combines two filtrations whose interaction is summarized by new scalar statistics, yet no invariance, stability, or non-degeneracy result is stated that would guarantee the summaries reflect geometric-topological content rather than filtration artifacts; this assumption is load-bearing for any claim of meaningful quantification.

Authors: The mixup barcode is constructed by combining standard persistent homology with image persistent homology to capture geometric-topological interactions between point clouds, with total mixup and total percentage mixup serving as direct scalar summaries of the resulting interaction. No invariance, stability, or non-degeneracy theorems are stated because the manuscript focuses on the definition of the construction and its application as a proof of concept. The application section provides concrete evidence that the summaries detect geometric positioning effects not captured by standard barcodes alone.
Revision: no

Circularity Check

0 steps flagged

No significant circularity identified


The paper defines a new object (mixup barcode) by combining two existing, independently defined tools—standard persistent homology and image persistent homology—then introduces scalar summaries of that object. This is a straightforward definitional construction rather than a derivation that reduces to its own inputs. No equations are presented that equate a claimed prediction or result to a fitted parameter or self-referential quantity; no self-citations are invoked as load-bearing uniqueness theorems; and the application section is framed explicitly as a proof-of-concept whose results only 'suggest' utility. The derivation chain is therefore self-contained and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Review performed on abstract only; full definitions, filtrations, and any parameter choices are not visible. The ledger therefore records only the high-level domain assumptions implied by the abstract.

axioms (1)

Domain assumption: Persistent homology and image persistent homology are well-defined functors on point clouds that produce barcodes encoding topological features.
Basis: Implicit in the decision to combine the two constructions; standard background in algebraic topology.

invented entities (1)

mixup barcode (no independent evidence)
Purpose: To encode geometric-topological interactions between two point sets.
Note: Newly defined object whose properties are asserted to capture the desired interactions.

pith-pipeline@v0.9.0 · 5692 in / 1232 out tokens · 25354 ms · 2026-05-24T03:46:43.127202+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Detecting Regime Transitions in Dynamical Systems via the Mixup Euler Characteristic Profile
math.DS · 2026-04 · unverdicted · novelty 7.0

The Mixup Euler Characteristic Profile detects regime transitions via Euler characteristics of geometric intersections in delay-embedded trajectories, achieving 9.50 days MAE on Indian monsoon onset with 32% improveme...

