pith. sign in

arxiv: 2604.08278 · v2 · submitted 2026-04-09 · 💻 cs.DS · cs.DB· cs.DM· cs.SI

Counting HyperGraphlets via Color Coding: a Quadratic Barrier and How to Break It

Pith reviewed 2026-05-10 17:35 UTC · model grok-4.3

classification 💻 cs.DS cs.DBcs.DMcs.SI
keywords hypergraphletscolor codingcountingorthogonal vectors conjecturehypergraphsmotifsalgorithmsniceness
0
0 comments X

The pith

Color coding cannot count k-hypergraphlets in sub-quadratic time under the Orthogonal Vectors Conjecture, but real-world hypergraphs that are (α,β)-nice allow an algorithm running in time 2^{O(k)} · (2^β |V| + α^k |E| + α² β ||H||).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that color coding for counting k-hypergraphlets on hypergraphs faces a quadratic time barrier under the Orthogonal Vectors Conjecture, implying no faster implementations exist in the worst case. It then defines (α,β)-niceness as a property that real-world hypergraphs often satisfy for small α and β, allowing the hypergraph to be split into a low-rank part and a low-degree part. Different algorithmic techniques are applied to each part and combined, yielding the improved runtime. A reader would care because hypergraphlets serve as building blocks for analyzing higher-order interactions in networks, and breaking the barrier makes counting practical on large datasets.

Core claim

Under the Orthogonal Vectors Conjecture, no color coding implementation counts k-hypergraphlets in sub-quadratic time in the input size. For (α,β)-nice hypergraphs the time is 2^{O(k)} · (2^β |V| + α^k |E| + α² β ||H||) with uniform sampling in k^{O(k)} · (β² + ln |V|) per sample.

What carries the argument

The (α,β)-niceness property of hypergraphs, which permits splitting into a rank-at-most-α sub-hypergraph and a degree-at-most-β sub-hypergraph to apply tailored color coding techniques.

Load-bearing premise

Real-world hypergraphs satisfy (α,β)-niceness for small fixed values of α and β.

What would settle it

A hypergraph dataset from an application where the smallest α and β making it nice are both large enough that α^k |E| or 2^β |V| exceeds quadratic time in the input size.

Figures

Figures reproduced from arXiv: 2604.08278 by Giacomo Fumagalli, Marco Bressan, Stefano Clemente.

Figure 1
Figure 1. Figure 1: A toy hypergraph H (left) and its α-split into H≤α (right, first) and H>α (right, second) for α = 4. Our algorithm employs different techniques on the two sub-hypergraphs and combines the results carefully, both in the preprocessing and sampling phase. (V, E>α) has degree at most β. 5 All the hypergraphs used in our experiments appear to be (α, β)-nice for small values of α, β. We then give a two-phase alg… view at source ↗
Figure 2
Figure 2. Figure 2: The (α, β)-curve of our hypergraphs. For each α = 0, . . . ,rank(H), the curve shows the smallest of β such that the hypergraph is (α, β)-nice. The dot marks the point (α, β) chosen by Hy￾perMotivo. See Section 7.1 for more details. 6 HyperMotivo We describe HyperMotivo, our algorithm for sampling and approximately counting k-hypergraphlets through color coding. The main idea behind HyperMotivo is to explo… view at source ↗
Figure 3
Figure 3. Figure 3: Structure of HyperMotivo. White blocks are in common with Motivo; gray ones are intro￾duced in this work. • the build-up phase takes time 2 O(k) · O  2 β |V | + α k |E| + α 2β∥H∥  ; • the sampling phase takes, on each invocation, expected time k O(k) · (β 2 + ln |V |) and yields a colorful k-hypergraphlet H[U] uniformly at random. A high-level overview of the algorithm is shown in [PITH_FULL_IMAGE:figur… view at source ↗
Figure 4
Figure 4. Figure 4: Sensitivity of HyperMotivo to the choice of α ⋆ . For α varying around α ⋆ , the plot shows the ratio T(α)/T(α ⋆ ) between the build-up times using α and using α ⋆ . Our choice of α ⋆ yields good performance over all inputs. For AR and GP we omit points with running times exceeding 24 hours. 0.6 0.8 P 1.0 1.2 1.4 e |e| ×106 0 1000 2000 3000 4000 5000 Running time (s) HyperMotivo Motivo [PITH_FULL_IMAGE:fi… view at source ↗
Figure 5
Figure 5. Figure 5: The linear-time behavior of HyperMotivo and the quadratic-time behavior of the naive Algorithm 1, on hypergraphs produced by a simple random model. 7.3.2 The quadratic barrier and HyperMotivo [PITH_FULL_IMAGE:figures/full_fig_p022_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Reduction in wall-clock time and peak memory usage of [PITH_FULL_IMAGE:figures/full_fig_p023_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Running time of HyperMotivo’s build-up phase on the MA dataset, for k = 5, as a function of the number of CPU threads used. Doubling the number of threads almost halves the running time, at least up to 4 threads. SA RH AR AX GP SE MA Dataset 102 103 104 Samples/s HyperMotivo Motivo [PITH_FULL_IMAGE:figures/full_fig_p024_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: The sampling speed of HyperMotivo and Algorithm 1, in samples per second. of errH, clustered in bins in the form [err −0.25, err +0.25). It can be seen that HyperMotivo yields accurate estimates for most of them, with the error distribution concentrated around 0. A note of warning is needed here. The total number of (non-isomorphic) hypergraphlets on k vertices is doubly exponential in k; for k = 5, it is … view at source ↗
Figure 9
Figure 9. Figure 9: Distribution of per-hypergraphlet count errors on four synthetic inputs, in bins in the form [PITH_FULL_IMAGE:figures/full_fig_p025_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Frequency distribution of 5-hypergraphlets with highest aggregate frequency across all hypergraphs. show only the 5-hypergraphlets with highest aggregate frequency across all datasets. 8 Conclusions and future work We have shown that counting k-hypergraphlets is subtly harder than counting k-graphlets, but equally efficient algorithms exist under mild structural assumptions on the input hypergraphs. Our w… view at source ↗
Figure 11
Figure 11. Figure 11: The hypergraph H obtained by the reduction of Theorem 4 on G = K4. The next observation is useful to prove the correctness of the reduction. From now on, H[U] refers to the section hypergraph induced by U ⊆ V (H), as formalized in Definition 2.3. Observation 1. Let (G, k) be an instance of k-Clique and (H, k′ ) the instance of k-SH obtained via the reduction of Theorem 4. Let U ⊆ V (H) be such that H[U] i… view at source ↗
Figure 12
Figure 12. Figure 12: Memory–time tradeoff and absolute build-up timings for [PITH_FULL_IMAGE:figures/full_fig_p034_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Timings breakdown for k ∈ {3, 4, 5}. 35 [PITH_FULL_IMAGE:figures/full_fig_p035_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Frequency distributions of k-hypergraphlets for k ∈ {3, 4, 5}. B.2 Hypergraphlet Frequencies [PITH_FULL_IMAGE:figures/full_fig_p036_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: First 20 4- and 5-hypergraphlets with highest aggregate frequency across all datasets. Left: K4. Right: K5. 37 [PITH_FULL_IMAGE:figures/full_fig_p037_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Running time of HyperMotivo’s build-up phase on the MA dataset for k ∈ {3, . . . , 8}. B.3 Parallel scalability Here we present parallel scalability experiments for the datasets MA ( [PITH_FULL_IMAGE:figures/full_fig_p038_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Running time of HyperMotivo’s build-up phase on the SE dataset for k ∈ {3, . . . , 8}. 39 [PITH_FULL_IMAGE:figures/full_fig_p039_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Running time of HyperMotivo’s build-up phase on the AX dataset for k ∈ {3, . . . , 8}. 40 [PITH_FULL_IMAGE:figures/full_fig_p040_18.png] view at source ↗
Figure 19
Figure 19. Figure 19: errH distributions for k ∈ {3, 4, 5} and 105 samples on four synthetic datasets. B.4 Accuracy We empirically assess the accuracy of HyperMotivo on small synthetic hypergraphs, following the same protocol as in the main paper. For each hypergraphlet isomorphism type H, we report the relative count error errH = (ˆcH − cH)/cH, where cH is the exact induced count of H in the input hypergraph and cˆH is the es… view at source ↗
read the original abstract

We study the problem of counting $k$-hypergraphlets, an interesting but surprisingly ignored primitive, with the aim of understanding whether efficient algorithms exist. To this end, we consider color coding, a well-known technique for approximately counting $k$-graphlets in graphs. Our first result is that, on hypergraphs, color coding encounters a quadratic barrier: under the Orthogonal Vector Conjecture, no implementation can run in sub-quadratic time in the input size. We then introduce a simple property, $(\alpha,\beta)$-niceness, that hypergraphs from real-world datasets appear to satisfy for small values of $\alpha$ and $\beta$. Intuitively, an $(\alpha,\beta)$-nice hypergraph can be split into two sub-hypergraphs having respectively rank at most $\alpha$ and degree at most $\beta$. By applying different techniques to each sub-hypergraph and carefully combining the outputs, we show how to run color coding in time $2^{O(k)} \cdot (2^\beta |V| + \alpha^k |E| + \alpha^2 \beta \|H\|)$, where $H=(V,E)$ is the input hypergraph. Afterwards, we can sample colorful $k$-hypergraphlets uniformly in expected $k^{O(k)} \cdot (\beta^2 + \ln |V|)$ time per sample. Experiments on real-world hypergraphs show that our algorithm significantly outperforms the naive quadratic algorithm, sometimes by more than an order of magnitude.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript establishes a quadratic barrier for color-coding based counting of k-hypergraphlets under the Orthogonal Vectors Conjecture. It defines the (α,β)-niceness property, which real-world hypergraphs appear to satisfy, allowing the input to be partitioned into a low-rank sub-hypergraph and a low-degree sub-hypergraph. The paper then presents an algorithm that applies color coding to each part separately and combines the results to achieve a running time of 2^{O(k)} · (2^β |V| + α^k |E| + α² β ||H||) for counting, along with an efficient uniform sampling procedure. Experimental results on real-world hypergraphs indicate substantial speedups compared to the quadratic baseline.

Significance. This work is significant in highlighting the limitations of direct application of color coding to hypergraphs and providing a practical workaround via the niceness property for datasets that satisfy it. The lower bound result is interesting and well-motivated. The algorithmic improvement, if rigorously proven, could enable new applications in hypergraph analysis. The empirical experiments are a positive aspect, showing real-world relevance. However, the reliance on an empirical property rather than a worst-case guarantee tempers the theoretical impact.

major comments (2)
  1. [Abstract] Abstract: the claimed running time 2^{O(k)} · (2^β |V| + α^k |E| + α² β ||H||) is obtained by running color coding separately on each part and combining the counts, but the manuscript provides no details on the combination procedure or error analysis for k-hypergraphlets whose edges cross the low-rank and low-degree sub-hypergraphs. This combination step is load-bearing for the central algorithmic claim and the asserted improvement over the quadratic barrier.
  2. [Niceness property definition] The (α,β)-niceness property is introduced as an empirical observation from real-world datasets rather than a structural theorem with an efficient partitioning algorithm. The time bound depends on small fixed α and β; without a formal definition, proof of efficient partitionability, or analysis of sensitivity to larger α/β, the sub-quadratic performance is not guaranteed even for the claimed class of inputs.
minor comments (1)
  1. [Abstract] The abstract uses ||H|| without an explicit definition (presumably the total size of the hypergraph); this should be clarified on first use.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the careful reading and constructive feedback. We appreciate the recognition of the lower bound's interest and the practical relevance of the niceness-based algorithm. We address the major comments point by point below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the claimed running time 2^{O(k)} · (2^β |V| + α^k |E| + α² β ||H||) is obtained by running color coding separately on each part and combining the counts, but the manuscript provides no details on the combination procedure or error analysis for k-hypergraphlets whose edges cross the low-rank and low-degree sub-hypergraphs. This combination step is load-bearing for the central algorithmic claim and the asserted improvement over the quadratic barrier.

    Authors: We agree that additional details are needed. The manuscript outlines the partition into low-rank and low-degree sub-hypergraphs and applies color coding separately, but the explicit procedure for combining counts of crossing k-hypergraphlets (those with edges in both parts) and the associated error analysis are not elaborated. We will revise by adding a dedicated subsection describing the combination via linearity of expectation over edge-type distributions, together with a complete error analysis establishing that the overall (1+ε)-approximation is preserved. This will be incorporated in the algorithmic section. revision: yes

  2. Referee: [Niceness property definition] The (α,β)-niceness property is introduced as an empirical observation from real-world datasets rather than a structural theorem with an efficient partitioning algorithm. The time bound depends on small fixed α and β; without a formal definition, proof of efficient partitionability, or analysis of sensitivity to larger α/β, the sub-quadratic performance is not guaranteed even for the claimed class of inputs.

    Authors: The (α,β)-niceness property receives a formal definition in Section 3 as the existence of a partition of the edge set into a rank-at-most-α sub-hypergraph and a maximum-degree-at-most-β sub-hypergraph. It is presented as an empirical property observed on real-world instances rather than a worst-case structural guarantee. We will revise to make the definition more prominent, add a short discussion of practical heuristics for identifying the partition (e.g., iterative removal of high-degree vertices), and include sensitivity analysis showing how the running time scales with α and β. A general efficient algorithm for computing such a partition on arbitrary inputs is not provided and lies outside the scope of the current work. revision: partial

standing simulated objections not resolved
  • The absence of a polynomial-time algorithm to compute an (α,β)-nice partition for arbitrary hypergraphs, which is required to guarantee the improved running time in the worst case rather than only on empirically nice instances.

Circularity Check

0 steps flagged

No circularity: time bound derived from external conjecture and empirical niceness split

full rationale

The quadratic barrier follows from the Orthogonal Vectors Conjecture (external hardness assumption). The sub-quadratic bound for (α,β)-nice hypergraphs is obtained by partitioning edges into low-rank and low-degree sub-hypergraphs (defined directly from the niceness property), then invoking standard color-coding separately on each part and combining counts. The niceness property itself is introduced as an empirical observation on real-world data rather than derived from the algorithm or any fitted parameter. No self-citations, self-definitional equations, or renamings of known results appear in the derivation chain. The running-time expression is therefore a direct consequence of the split and standard techniques, not a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claims rest on the Orthogonal Vectors Conjecture and the empirical claim that real hypergraphs are (α,β)-nice; no free parameters are fitted inside the algorithm itself.

axioms (2)
  • domain assumption Orthogonal Vectors Conjecture
    Invoked to establish the quadratic lower bound for color coding on hypergraphs.
  • standard math Standard properties of color coding and random coloring for approximate counting
    Used as the base technique that is extended to the hypergraph setting.
invented entities (1)
  • (α,β)-niceness property no independent evidence
    purpose: To enable splitting the hypergraph into low-rank and low-degree sub-hypergraphs for separate processing
    Newly defined in the paper; no independent evidence provided beyond the claim that real datasets satisfy it for small α,β.

pith-pipeline@v0.9.0 · 5585 in / 1548 out tokens · 154192 ms · 2026-05-10T17:35:16.043286+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

  1. [1]

    The shape of collaborations.EPJ Data Science, 6(1):18, 2017

    Alice Patania, Giovanni Petri, and Francesco Vaccarino. The shape of collaborations.EPJ Data Science, 6(1):18, 2017

  2. [2]

    Benson, Rediet Abebe, Michael T

    Austin R. Benson, Rediet Abebe, Michael T. Schaub, Ali Jadbabaie, and Jon Kleinberg. Simplicial closure and higher-order link prediction.Proceedings of the National Academy of Sciences, 115(48), November 2018

  3. [3]

    Higher-order motif analysis in hypergraphs.Communications Physics, 5(1), April 2022

    Quintino Francesco Lotito, Federico Musciotto, Alberto Montresor, and Federico Battiston. Higher-order motif analysis in hypergraphs.Communications Physics, 5(1), April 2022

  4. [4]

    Higher-order molecular organization as a source of biological function.Bioinformatics, 34(17):i944–i953, 09 2018

    Thomas Gaudelet, No ¨el Malod-Dognin, and Nataˇsa Prˇzulj. Higher-order molecular organization as a source of biological function.Bioinformatics, 34(17):i944–i953, 09 2018

  5. [5]

    Jakob Lykke Andersen, Christoph Flamm, Daniel Merkle, and Peter F. Stadler. Chemical trans- formation motifs—modelling pathways as integer hyperflows.IEEE/ACM Transactions on Com- putational Biology and Bioinformatics, 16:510–523, 2017

  6. [6]

    Juul, Austin R

    Jonas L. Juul, Austin R. Benson, and Jon Kleinberg. Hypergraph patterns and collaboration struc- ture.Frontiers in Physics, Volume 11 - 2023, 2024

  7. [7]

    Hypergraph motif representation learning

    Alessia Antelmi, Gennaro Cordasco, Daniele De Vinco, Valerio Di Pasquale, Mirko Polato, and Carmine Spagnuolo. Hypergraph motif representation learning. KDD ’25, page 13–24, New York, NY, USA, 2025. Association for Computing Machinery

  8. [8]

    Color-Coding.Journal of the ACM, 42(4):844–856, 1995

    Noga Alon, Raphael Yuster, and Uri Zwick. Color-Coding.Journal of the ACM, 42(4):844–856, 1995

  9. [9]

    Faster motif counting via succinct color coding and adaptive sampling.ACM Trans

    Marco Bressan, Stefano Leucci, and Alessandro Panconesi. Faster motif counting via succinct color coding and adaptive sampling.ACM Trans. Knowl. Discov. Data, 15(6), May 2021

  10. [10]

    Ex- act and sampling methods for mining higher-order motifs in large hypergraphs.Computing, 106(2):475–494, February 2024

    Quintino Francesco Lotito, Federico Musciotto, Federico Battiston, and Alberto Montresor. Ex- act and sampling methods for mining higher-order motifs in large hypergraphs.Computing, 106(2):475–494, February 2024

  11. [11]

    Juedes, Iyad A

    Jianer Chen, Benny Chor, Mike Fellows, Xiuzhen Huang, David W. Juedes, Iyad A. Kanj, and Ge Xia. Tight lower bounds for certain parameterized NP-hard problems.Inf. Comput., 201(2):216– 231, 2005

  12. [12]

    Motivo: Fast motif counting via suc- cinct color coding and adaptive sampling.Proc

    Marco Bressan, Stefano Leucci, and Alessandro Panconesi. Motivo: Fast motif counting via suc- cinct color coding and adaptive sampling.Proc. VLDB Endow., 12(11):1651–1663, July 2019

  13. [13]

    Hypergraph motifs: concepts, algorithms, and discoveries

    Geon Lee, Jihoon Ko, and Kijung Shin. Hypergraph motifs: concepts, algorithms, and discoveries. Proc. VLDB Endow., 13(12):2256–2269, July 2020

  14. [14]

    Efficient Detection of Network Motifs.IEEE/ACM Transactions on Computa- tional Biology and Bioinformatics, 3(4):347–359, October 2006

    Sebastian Wernicke. Efficient Detection of Network Motifs.IEEE/ACM Transactions on Computa- tional Biology and Bioinformatics, 3(4):347–359, October 2006

  15. [15]

    The complexity of counting small sub-hypergraphs, 2025

    Marco Bressan, Julian Brinkmann, Holger Dell, Marc Roth, and Philip Wellnitz. The complexity of counting small sub-hypergraphs, 2025

  16. [16]

    Tractable Hypergraph Properties for Constraint Satisfaction and Conjunctive Queries.Journal of the ACM, 60(6):42:1–42:51, 2013

    D ´aniel Marx. Tractable Hypergraph Properties for Constraint Satisfaction and Conjunctive Queries.Journal of the ACM, 60(6):42:1–42:51, 2013

  17. [17]

    Orthogonal vectors is hard for first-order properties on sparse graphs

    Jiawei Gao and Russell Impagliazzo. Orthogonal vectors is hard for first-order properties on sparse graphs. InElectronic Colloquium on Computational Complexity (ECCC), volume 23, page 53, 2016. 27

  18. [18]

    Lang, Anirban Dasgupta, and Michael W

    Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters.Internet Mathematics, 6:123 – 29, 2008

  19. [19]

    Motif counting beyond five nodes.ACM TKDD, 12(4), 2018

    Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, and Alessandro Panconesi. Motif counting beyond five nodes.ACM TKDD, 12(4), 2018

  20. [20]

    M. D. Vose. A linear algorithm for generating random numbers with a given distribution.IEEE Transactions on Software Engineering, 17(9):972–975, 1991

  21. [21]

    Benson, and Jon Kleinberg

    Nate Veldt, Austin R. Benson, and Jon Kleinberg. Minimizing localized ratio cut objectives in hypergraphs. InProceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press, 2020

  22. [22]

    Justifying recommendations using distantly-labeled reviews and fine-grained aspects

    Jianmo Ni, Jiacheng Li, and Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. InProceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 188–197, 2019

  23. [23]

    Kanj, and Ge Xia

    Jianer Chen, Xiuzhen Huang, Iyad A. Kanj, and Ge Xia. Strong computational lower bounds via parameterized complexity.Journal of Computer and System Sciences, 72(8):1346 – 1367, 2006. 28 APPENDIX This section complements the main paper by providing additional material that was omitted from the main text due to space constraints. It is organized as follows:...