
arxiv: 2605.16527 · v1 · pith:RVW7CS3Onew · submitted 2026-05-15 · 💻 cs.LG · cs.AI

Hypergraph Pattern Machine: Compositional Tokenization for Higher-Order Interactions

Pith reviewed 2026-05-20 20:04 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords hypergraph learning · compositional tokenization · higher-order interactions · masked reconstruction · inclusion DAG · polypharmacy · adverse event prediction · Transformer

The pith

The Hypergraph Pattern Machine learns whether higher-order relations are compositional, emergent, or inhibitory by tokenizing subsets and reconstructing them on an inclusion DAG.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Hypergraphs capture relations beyond pairs, such as drug triples whose effects cannot be reduced to individual drugs. The central signal is compositionality: whether a triple simplifies to its parts, requires all members together, or is blocked by one member. Existing hypergraph methods pass messages only over observed hyperedges and therefore miss these distinctions, so dangerous combinations can be misclassified in applications such as polypharmacy. HGPM instead breaks hyperedges into compositional subsets, arranges them in an inclusion DAG, and trains an inclusion-aware Transformer to reconstruct masked tokens. On ten benchmarks it matches or exceeds prior methods; in a real adverse-event case it alone identifies the single inhibitory drug among otherwise identical candidates.

Core claim

Shifting from message passing over observed hyperedges to learning the compositional pattern of subsets (by tokenizing them, organizing the tokens in an inclusion DAG, and training under masked reconstruction) captures signals of compositionality, emergence, and inhibition. These signals determine whether a higher-order relation can be simplified, must be kept intact, or is disrupted by one of its members.

What carries the argument

Tokenization of compositional subsets organized into an inclusion DAG, processed by an inclusion-aware Transformer under masked reconstruction.
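
The tokenization and inclusion DAG named above can be sketched in a few lines of Python. This is an illustrative reading, not the authors' implementation: the names `label_pair` and `build_inclusion_dag`, the `candidates` argument, and the mapping from observed endpoints to regimes (both observed: compositional; only the superset observed: emergent; only the subset observed: inhibitory) are assumptions inferred from the abstract and the Figure 1 caption.

```python
from itertools import combinations


def label_pair(sub_observed, sup_observed):
    """Assumed regime of a subset -> superset pair, by which endpoints are observed."""
    if sub_observed and sup_observed:
        return "compositional"
    if sup_observed:
        return "emergent"
    if sub_observed:
        return "inhibitory"
    return "unobserved"


def build_inclusion_dag(hyperedges, target, candidates=()):
    """Tokenize subsets containing `target` and link adjacent orders (hypothetical helper)."""
    observed = {frozenset(e) for e in hyperedges}
    tokens = set()
    # Tokens: every subset of size >= 2 that contains the target and lies
    # inside an observed hyperedge or an unobserved candidate set.
    for edge in list(observed) + [frozenset(c) for c in candidates]:
        if target not in edge:
            continue
        others = sorted(edge - {target})
        for r in range(1, len(others) + 1):
            for combo in combinations(others, r):
                tokens.add(frozenset(combo) | {target})
    # DAG edges: proper subset -> superset pairs that differ by exactly one member.
    dag = {}
    for small in tokens:
        for big in tokens:
            if len(big) == len(small) + 1 and small < big:
                dag[(small, big)] = label_pair(small in observed, big in observed)
    return tokens, dag


# Toy data: three observed hyperedges and one unobserved candidate set.
observed_edges = [{"a", "b"}, {"a", "b", "c"}, {"a", "d", "e"}]
tokens, dag = build_inclusion_dag(observed_edges, "a", candidates=[{"a", "b", "d"}])
for (small, big), regime in dag.items():
    print(sorted(small), "->", sorted(big), regime)
```

In this toy example the pair {a, b} → {a, b, d} has an observed subset and an unobserved superset, which is the pattern the sketch treats as inhibitory. Enumerating all subsets is exponential in hyperedge size, so a practical version would need sampling or an order cap.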

If this is right

  • The model matches or exceeds state-of-the-art accuracy on ten hypergraph benchmarks.
  • It alone distinguishes the drug addition that inhibits an adverse effect among candidates that share identical features.
  • Modeling compositionality prevents misclassification of dangerous drug combinations that message-passing methods overlook.
  • The same tokenization-plus-reconstruction approach applies to any domain where higher-order relations carry compositional meaning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The inclusion-DAG construction could be adapted to model emergent group effects in social or biological networks without labeled composition data.
  • Replacing the reconstruction objective with an explicit inhibitory-label loss would test whether the current unsupervised signal is the main driver of the reported discrimination.
  • The method suggests a general route for injecting higher-order pattern awareness into any Transformer that currently operates only on flat sets or pairs.

Load-bearing premise

The compositional, emergent, and inhibitory signals present in hyperedges are sufficiently captured by tokenizing subsets and performing masked reconstruction on an inclusion DAG without extra supervision or loss of critical structural information.
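
To make the premise concrete, here is a minimal sketch of a masked-reconstruction loss over subset tokens. The Figure 2 caption states only that the objective jointly supervises subset semantics and existence; the squared-error and binary cross-entropy terms, their equal weighting, and the helper names `mask_tokens` and `masked_reconstruction_loss` are assumptions made for illustration.

```python
import math
import random


def mask_tokens(num_tokens, mask_ratio, rng):
    """Pick token positions to hide; at least one position is always masked."""
    k = max(1, round(mask_ratio * num_tokens))
    return sorted(rng.sample(range(num_tokens), k))


def masked_reconstruction_loss(pred_emb, true_emb, pred_exist, true_exist, masked):
    """Mean over masked tokens of (embedding squared error + existence cross-entropy)."""
    total = 0.0
    for i in masked:
        # Semantic term: mean squared error between predicted and target embedding.
        sem = sum((p - t) ** 2 for p, t in zip(pred_emb[i], true_emb[i])) / len(true_emb[i])
        # Existence term: binary cross-entropy on "was this subset observed?".
        p = min(max(pred_exist[i], 1e-7), 1 - 1e-7)
        bce = -(true_exist[i] * math.log(p) + (1 - true_exist[i]) * math.log(1 - p))
        total += sem + bce
    return total / len(masked)


# Four toy tokens with 2-d embeddings; the third subset is unobserved.
true_emb = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
true_exist = [1, 1, 0, 1]
masked = mask_tokens(len(true_emb), 0.5, random.Random(0))

good = masked_reconstruction_loss(true_emb, true_emb, [0.99, 0.99, 0.01, 0.99], true_exist, masked)
bad = masked_reconstruction_loss(true_emb, true_emb, [0.01, 0.01, 0.99, 0.01], true_exist, masked)
print(masked, good, bad)
```

Under this reading, an unobserved superset of an observed subset is an ordinary existence target of 0, so no separate inhibitory label is required. Whether that signal suffices is the premise in question.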

What would settle it

A hypergraph dataset in which the model cannot identify the single inhibitory addition among feature-identical candidates, or in which its accuracy falls below standard message-passing baselines on any task that requires distinguishing compositional from emergent or inhibitory relations.

Figures

Figures reproduced from arXiv: 2605.16527 by Fang Wu, Kyrie Zhao, Pietro Lio, Sheng Wang, Tianyi Ma, Xiangru Tang, Yanfang Ye, Zehong Wang.

Figure 1. Interaction compositionality. (a) Existing hypergraph methods propagate messages only over observed hyperedges (a.1), failing to distinguish hyperedge pairs whose compositionality structure differs (a.2). (b) Each adjacent-order subset pair falls into one of three regimes by which endpoints are observed: compositional (b.1), emergent (b.2), and inhibitory (b.3). (c) HGPM tokenizes observed (solid) and unob…
Figure 2. Hypergraph Pattern Machine. For each target entity, HGPM tokenizes subsets containing it and organizes them as an inclusion DAG with composition-labeled edges (left). A masked-reconstruction objective jointly supervises subset semantics and existence (middle). Pairwise structural biases inject inclusion topology and composition labels into self-attention (right). Why This Requires Going Beyond Message Pas…
Figure 3. Cross-order generalization on HODDI. We report the test AUROC for Fixed Training on orders {2, 3} and Progressive Training on {2, …, k−1} for each test order k. Cross-Order Generalization. Higher-order drug interaction data is intrinsically heavy-tailed: the combinatorial space grows as $\binom{|V|}{k}$ and reporting bias toward common regimens leaves the tail sparse. HODDI, for instance, contains roughly 49,…
Figure 4. Case study on selecting a suppressive add-on for peripheral neuropathy. … high-order target through a chain of inclusion edges to lower-order subsets that are dense in training, transporting the dense low-order signal into the sparse tail. 6.3 Case Study. We close with a polypharmacy case study. A JADER [52] peripheral-neuropathy report documents a patient on FOLFOX [12] (oxaliplatin, fluorouracil, levofolina…
Figure 5. Binary link-prediction score for each candidate added to FOLFOX, comparing HGPM with …
Figure 6. Top-3 adverse events predicted by the feature-similarity baseline for each candidate added to FOLFOX. Peripheral neuropathy ranks first across all three branches, indicating that the baseline cannot resolve regimen-specific adverse-event signatures from drug-feature similarity alone.
Original abstract

Hypergraphs model higher-order relations that drive real-world decisions, from drug prescriptions to recommendations. A central structural signal in such data, beyond what pairwise relations can express, is interaction compositionality: whether a higher-order relation is compositional, emergent, or inhibitory with respect to its observed or unobserved sets. In polypharmacy, the regime decides whether a drug should be dropped, kept, or excluded: a compositional drug triple can be safely simplified, an emergent triple requires all drugs jointly, and an inhibitory triple flags a drug that disrupts an existing interaction. However, existing hypergraph learning methods, which merely propagate messages over observed hyperedges, leave this compositional signal unmodeled, allowing dangerous drug combinations to slip through and be misclassified. To this end, we propose the Hypergraph Pattern Machine (HGPM), shifting the paradigm from message passing to learning the compositional pattern of subsets. It tokenizes compositional subsets, organizes them in an inclusion DAG, and trains an inclusion-aware Transformer under masked reconstruction. On ten hypergraph benchmarks, HGPM matches or exceeds state-of-the-art methods. Notably, in a real adverse-event prediction case, HGPM correctly identifies the drug addition that inhibits the side effect among feature-identical candidates, a discrimination existing methods cannot make. The code and data are in https://github.com/KryieZhao/HGPM.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Hypergraph Pattern Machine (HGPM), which tokenizes subsets of hyperedges, organizes them into an inclusion DAG, and trains an inclusion-aware Transformer via masked reconstruction to model compositional, emergent, and inhibitory higher-order interactions in hypergraphs. It claims this paradigm shift from message passing enables better handling of interaction regimes, with HGPM matching or exceeding SOTA on ten benchmarks and correctly identifying an inhibitory drug addition in a real adverse-event prediction case among feature-identical candidates.

Significance. If the central claims hold, the work offers a meaningful alternative to standard hypergraph message-passing methods by directly targeting compositional patterns, with clear relevance to safety-critical applications such as polypharmacy. The open release of code and data is a positive contribution to reproducibility. The practical case study adds value, though the overall significance hinges on whether the architecture genuinely captures inhibitory signals by design rather than through indirect data statistics.

major comments (2)
  1. [§3.2] (Training objective and inclusion-aware Transformer): The masked reconstruction loss completes observed positive structures in the inclusion DAG but contains no explicit negative supervision, contrastive terms, or penalty for non-inclusion. It is therefore unclear how the model distinguishes inhibitory (disruptive) hyperedges from compositional or emergent ones by design rather than via incidental data statistics; this assumption is load-bearing for the claim that all three regimes are modeled without additional supervision.
  2. [§5.2] (Adverse-event case study): The qualitative demonstration that HGPM identifies the inhibitory drug addition among feature-identical candidates is presented without quantitative scores, baseline outputs on the same candidates, ablation on DAG construction, or error analysis. This weakens the assertion that existing methods cannot make the discrimination and leaves the practical claim difficult to evaluate.
minor comments (2)
  1. [Abstract] The statement that HGPM 'matches or exceeds state-of-the-art methods' on ten benchmarks would be strengthened by naming the benchmarks or reporting at least one key metric (e.g., average improvement).
  2. [§3.1] Notation and figures: The construction of the inclusion DAG from raw hyperedges would benefit from an explicit algorithmic description or pseudocode to clarify how unobserved subsets are handled.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and detailed comments. We address each major point below, clarifying our approach and indicating where we will revise the manuscript to strengthen the presentation.

  1. Referee: [§3.2] (Training objective and inclusion-aware Transformer): The masked reconstruction loss completes observed positive structures in the inclusion DAG but contains no explicit negative supervision, contrastive terms, or penalty for non-inclusion. It is therefore unclear how the model distinguishes inhibitory (disruptive) hyperedges from compositional or emergent ones by design rather than via incidental data statistics; this assumption is load-bearing for the claim that all three regimes are modeled without additional supervision.

    Authors: The inclusion DAG encodes hierarchical subset relationships derived from observed hyperedges, so that masked reconstruction must learn to complete or reject patterns based on consistency with the full observed structure. Inhibitory cases manifest as low reconstruction likelihood for subsets that would otherwise be expected under compositional rules, because the attention mechanism is conditioned on inclusion relations. We agree that the current description leaves this implicit and that an explicit discussion would help. In revision we will expand §3.2 with a paragraph deriving how the objective separates the three regimes via the DAG structure and will add a short supplementary analysis comparing reconstruction probabilities on inhibitory versus compositional examples.

    revision: partial

  2. Referee: [§5.2] (Adverse-event case study): The qualitative demonstration that HGPM identifies the inhibitory drug addition among feature-identical candidates is presented without quantitative scores, baseline outputs on the same candidates, ablation on DAG construction, or error analysis. This weakens the assertion that existing methods cannot make the discrimination and leaves the practical claim difficult to evaluate.

    Authors: We accept that the case study would be more convincing with quantitative backing. In the revised manuscript we will augment §5.2 with (i) prediction scores for HGPM and the baselines on the exact candidate set, (ii) an ablation that removes the inclusion DAG while keeping the same tokenization, and (iii) a short error analysis of the misclassifications produced by the baselines. These additions will directly support the claim that the discrimination is not achieved by prior methods.

    revision: yes
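
The first response leans on attention being conditioned on inclusion relations. A minimal sketch of that idea follows, assuming an additive per-relation bias on the attention logits, which is one common way to inject pairwise structure. The function `biased_attention`, the relation vocabulary, and the bias values are hypothetical; in a trained model the biases would be learned, and the paper's exact formulation is not given in the text above.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def biased_attention(queries, keys, values, relation, bias):
    """Single-head attention whose logits get an additive bias per pairwise relation."""
    d = len(queries[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        logits = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) + bias[relation[i][j]]
            for j, k in enumerate(keys)
        ]
        w = softmax(logits)
        weights.append(w)
        outputs.append(
            [sum(w[j] * values[j][c] for j in range(len(values))) for c in range(len(values[0]))]
        )
    return outputs, weights


# Three tokens with identical features: only the relation labels tell them apart.
x = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
relation = [
    ["self", "compositional", "inhibitory"],
    ["compositional", "self", "none"],
    ["inhibitory", "none", "self"],
]
bias = {"self": 0.0, "compositional": 1.0, "inhibitory": -1.0, "none": 0.0}  # made-up values
outputs, weights = biased_attention(x, x, x, relation, bias)
print(weights[0])
```

Because the three token embeddings are identical, any difference in the attention weights comes from the relation bias alone. This is the kind of discrimination among feature-identical candidates that the case study claims.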

Circularity Check

0 steps flagged

No circularity: HGPM introduces independent modeling shift via subset tokenization and masked reconstruction on inclusion DAG

full rationale

The paper presents HGPM as a new architecture that tokenizes compositional subsets, builds an inclusion DAG, and applies masked reconstruction with an inclusion-aware Transformer. No equations or derivations are shown that reduce the reported performance or inhibitory discrimination to a fitted quantity defined in terms of the target labels or by self-referential construction. Claims rest on empirical results across benchmarks and a real adverse-event case study rather than any self-definitional loop, fitted-input-as-prediction, or load-bearing self-citation chain. The method is framed as a paradigm shift from message passing, with the three regimes (compositional/emergent/inhibitory) addressed through the structural organization and reconstruction objective without reducing to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

Abstract-only review yields limited visibility into implementation details; the ledger records the core modeling assumptions stated or implied by the abstract.

axioms (2)
  • domain assumption: Higher-order relations in hypergraphs carry distinguishable compositional, emergent, or inhibitory signals relative to their subsets.
    Invoked when the abstract states that existing message-passing leaves this signal unmodeled.
  • domain assumption: An inclusion DAG over tokenized subsets preserves the necessary structure for masked reconstruction to recover interaction types.
    Central to the described pipeline of tokenization, DAG organization, and inclusion-aware training.
invented entities (1)
  • Inclusion-aware Transformer (no independent evidence)
    purpose: Transformer variant that respects the inclusion DAG during masked reconstruction of compositional patterns.
    Introduced as the training architecture; no independent evidence of its properties is supplied in the abstract.

pith-pipeline@v0.9.0 · 5791 in / 1497 out tokens · 60996 ms · 2026-05-20T20:04:12.873174+00:00 · methodology



Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages

  1. [1]

    data2vec: A general framework for self-supervised learning in speech, vision and language

    Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, and Michael Auli. data2vec: A general framework for self-supervised learning in speech, vision and language. In International Conference on Machine Learning (ICML), 2022

  2. [2]

    Song Bai, Feihu Zhang, and Philip H. S. Torr. Hypergraph convolution and hypergraph attention. Pattern Recognition, 2021

  3. [3]

    Risk of upper gastrointestinal hemorrhage in warfarin users treated with nonselective NSAIDs or COX-2 inhibitors

    Marisa Battistella, Muhammad M. Mamdami, David N. Juurlink, Linda Rabeneck, and Andreas Laupacis. Risk of upper gastrointestinal hemorrhage in warfarin users treated with nonselective NSAIDs or COX-2 inhibitors. Archives of Internal Medicine, 2005

  4. [4]

    Weisfeiler and Lehman go cellular: CW networks

    Cristian Bodnar, Fabrizio Frasca, Nina Otter, Yu Guang Wang, Pietro Liò, Guido F. Montúfar, and Michael Bronstein. Weisfeiler and Lehman go cellular: CW networks. In Advances in Neural Information Processing Systems (NeurIPS), 2021

  5. [5]

    Weisfeiler and Lehman go topological: Message passing simplicial networks

    Cristian Bodnar, Fabrizio Frasca, Yu Guang Wang, Nina Otter, Guido Montúfar, Pietro Liò, and Michael Bronstein. Weisfeiler and Lehman go topological: Message passing simplicial networks. In Proceedings of the 38th International Conference on Machine Learning (ICML), 2021

  6. [6]

    An optimal lower bound on the number of variables for graph identification

    Jin-Yi Cai, Martin Fürer, and Neil Immerman. An optimal lower bound on the number of variables for graph identification. Combinatorica, 1992

  7. [7]

    Emerging properties in self-supervised vision transformers

    Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In IEEE/CVF International Conference on Computer Vision (ICCV), 2021

  8. [8]

    Chemotherapy-induced peripheral neurotoxicity

    Guido Cavaletti and Paola Marmiroli. Chemotherapy-induced peripheral neurotoxicity. Nature Reviews Neurology, 2010

  9. [9]

    MUFFIN: multi-scale feature fusion for drug–drug interaction prediction

    Yujie Chen, Tengfei Ma, Xixi Yang, Jianmin Wang, Bosheng Song, and Xiangxiang Zeng. MUFFIN: multi-scale feature fusion for drug–drug interaction prediction. Bioinformatics, 2021

  10. [10]

    You are AllSet: A multiset function framework for hypergraph neural networks

    Eli Chien, Chao Pan, Jianhao Peng, and Olgica Milenkovic. You are AllSet: A multiset function framework for hypergraph neural networks. In International Conference on Learning Representations (ICLR), 2022

  11. [11]

    Theoretical basis, experimental design, and computerized simulation of synergism and antagonism in drug combination studies

    Ting-Chao Chou. Theoretical basis, experimental design, and computerized simulation of synergism and antagonism in drug combination studies. Pharmacological Reviews, 2006

  12. [12]

    Leucovorin and fluorouracil with or without oxaliplatin as first-line treatment in advanced colorectal cancer

    Aimery de Gramont, A. Figer, M. Seymour, M. Homerin, A. Hmissi, J. Cassidy, C. Boni, H. Cortes-Funes, A. Cervantes, G. Freyer, D. Papamichael, N. Le Bail, C. Louvet, D. Hendler, F. de Braud, C. Wilson, F. Morvan, and A. Bonetti. Leucovorin and fluorouracil with or without oxaliplatin as first-line treatment in advanced colorectal cancer. Journal of Clinica...

  13. [13]

    HNHN: Hypergraph networks with hyperedge neurons

    Yihe Dong, Will Sawin, and Yoshua Bengio. HNHN: Hypergraph networks with hyperedge neurons. In ICML Graph Representation Learning and Beyond (GRL+) Workshop, 2020

  14. [14]

    Sheaf hypergraph networks

    Iulia Duta, Giulia Cassarà, Fabrizio Silvestri, and Pietro Liò. Sheaf hypergraph networks. In Advances in Neural Information Processing Systems (NeurIPS), 2023

  15. [15]

    Simplicial neural networks

    Stefania Ebli, Michaël Defferrard, and Gard Spreemann. Simplicial neural networks. In NeurIPS 2020 Workshop on Topological Data Analysis and Beyond, 2020

  16. [16]

    Floor Eijkelboom, Rob Hesselink, and Erik J. Bekkers. E(n) equivariant message passing simplicial networks. In Proceedings of the 40th International Conference on Machine Learning (ICML), 2023

  17. [17]

    Hypergraph neural networks

    Yifan Feng, Haoxuan You, Zizhao Zhang, Rongrong Ji, and Yue Gao. Hypergraph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, 2019

  18. [18]

    Hypergraph foundation model

    Yue Gao, Yifan Feng, Shiquan Liu, Xiangmin Han, Shaoyi Du, Zongze Wu, and Han Hu. Hypergraph foundation model. arXiv preprint arXiv:2503.01203, 2025

  19. [19]

    Bootstrap your own latent: A new approach to self-supervised learning

    Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, and Michal Valko. Bootstrap your own latent: A new approach to self-supervised learning. In Advances in Neural Information P...

  20. [20]

    Topological deep learning: Going beyond graph data

    Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Nina Miolane, Aldo Guzmán-Sáenz, Karthikeyan Natesan Ramamurthy, Tolga Birdal, Tamal K. Dey, Soham Mukherjee, Shreyas N. Samaga, Neal Livesay, Robin Walters, Paul Rosen, and Michael T. Schaub. Topological deep learning: Going beyond graph data. arXiv preprint arXiv:2206.00606, 2023

  21. [21]

    A simplified bargaining model for the n-person cooperative game

    John C. Harsanyi. A simplified bargaining model for the n-person cooperative game. International Economic Review, 1963

  22. [22]

    Systematic overview of warfarin and its drug and food interactions

    Anne M. Holbrook, Jennifer A. Pereira, Renée Labiris, Heather McDonald, James D. Douketis, Mark Crowther, and Philip S. Wells. Systematic overview of warfarin and its drug and food interactions. Archives of Internal Medicine, 2005

  23. [23]

    GraphMAE: Self-supervised masked graph autoencoders

    Zhenyu Hou, Xiao Liu, Yukuo Cen, Yuxiao Dong, Hongxia Yang, Chunjie Wang, and Jie Tang. GraphMAE: Self-supervised masked graph autoencoders. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022

  24. [24]

    GraphMAE2: A decoding-enhanced masked self-supervised graph learner

    Zhenyu Hou, Yufei He, Yukuo Cen, Xiao Liu, Yuxiao Dong, Evgeny Kharlamov, and Jie Tang. GraphMAE2: A decoding-enhanced masked self-supervised graph learner. In Proceedings of the ACM Web Conference (WWW), 2023

  25. [25]

    UniGNN: a unified framework for graph and hypergraph neural networks

    Jing Huang and Jie Yang. UniGNN: a unified framework for graph and hypergraph neural networks. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI), 2021

  26. [26]

    Grape: Knowledge graph enhanced passage reader for open-domain question answering

    Mingxuan Ju, Wenhao Yu, Tong Zhao, Chuxu Zhang, and Yanfang Ye. Grape: Knowledge graph enhanced passage reader for open-domain question answering. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 169–181, 2022

  27. [27]

    Semi-supervised classification with graph convolutional networks

    Thomas N. Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (ICLR), 2017

  28. [28]

    A Primer of Real Analytic Functions

    Steven G. Krantz and Harold R. Parks. A Primer of Real Analytic Functions. Birkhäuser, 2002

  29. [29]

    Rethinking graph transformers with spectral attention

    Devin Kreuzer, Dominique Beaini, William L. Hamilton, Vincent Létourneau, and Prudencio Tossou. Rethinking graph transformers with spectral attention. In Advances in Neural Information Processing Systems (NeurIPS), 2021

  30. [30]

    Deep hypergraph neural networks with tight framelets

    Ming Li, Yujie Fang, Yi Wang, Han Feng, Yongchun Gu, Lu Bai, and Pietro Liò. Deep hypergraph neural networks with tight framelets. In Proceedings of the AAAI Conference on Artificial Intelligence, 2025

  31. [31]

    Graph is a substrate across data modalities

    Ziming Li, Xiaoming Wu, Zehong Wang, Jiazheng Li, Yijun Tian, Jinhe Bi, Yunpu Ma, Yanfang Ye, and Chuxu Zhang. Graph is a substrate across data modalities. In International Conference on Machine Learning (ICML), 2026

  32. [32]

    Adaptive expansion for hypergraph learning

    Tianyi Ma, Yiyue Qian, Shinan Zhang, Chuxu Zhang, and Yanfang Ye. Adaptive expansion for hypergraph learning. arXiv preprint arXiv:2502.15564, 2025

  33. [33]

    Hypergraph representation learning with adaptive broadcasting and receiving

    Tianyi Ma, Yiyue Qian, Zheyuan Zhang, Zehong Wang, Shinan Zhang, Chuxu Zhang, and Yanfang Ye. Hypergraph representation learning with adaptive broadcasting and receiving. In ICDM, 2025

  34. [34]

    Bhygnn+: Unsupervised representation learning for heterophilic hypergraphs

    Tianyi Ma, Yiyue Qian, Zehong Wang, Zheyuan Zhang, Chuxu Zhang, and Yanfang Ye. Bhygnn+: Unsupervised representation learning for heterophilic hypergraphs. arXiv preprint arXiv:2602.14919, 2026

  35. [35]

    Temporal graph pattern machine

    Yijun Ma, Zehong Wang, Weixiang Sun, and Yanfang Ye. Temporal graph pattern machine. arXiv preprint arXiv:2601.22454, 2026

  36. [36]

    The zero set of a real analytic function

    Boris Mityagin. The zero set of a real analytic function. Mathematical Notes, 2020

  37. [37]

    Weisfeiler and Leman go neural: Higher-order graph neural networks

    Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. Weisfeiler and Leman go neural: Higher-order graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, 2019

  38. [38]

    SPARSE: a sparse hypergraph neural network for learning multiple types of latent combinations to accurately predict drug–drug interactions

    Duc Anh Nguyen, Canh Hao Nguyen, Peter Petschner, and Hiroshi Mamitsuka. SPARSE: a sparse hypergraph neural network for learning multiple types of latent combinations to accurately predict drug–drug interactions. Bioinformatics, 2022

  39. [39]

    SSI-DDI: substructure–substructure interactions for drug–drug interaction prediction

    Arnold K Nyamabo, Hui Yu, and Jian-Yu Shi. SSI-DDI: substructure–substructure interactions for drug–drug interaction prediction. Briefings in Bioinformatics, 2021

  40. [40]

    Deep learning for high-order drug-drug interaction prediction

    Bo Peng and Xia Ning. Deep learning for high-order drug-drug interaction prediction. In Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, 2019

  41. [41]

    DeepSynergy: predicting anti-cancer drug synergy with deep learning

    Kristina Preuer, Richard P I Lewis, Sepp Hochreiter, Andreas Bender, Krishna C Bulusu, and Günter Klambauer. DeepSynergy: predicting anti-cancer drug synergy with deep learning. Bioinformatics, 2018

  42. [42]

    Co-modality graph contrastive learning for imbalanced node classification

    Yiyue Qian, Chunhui Zhang, Yiming Zhang, Qianlong Wen, Yanfang Ye, and Chuxu Zhang. Co-modality graph contrastive learning for imbalanced node classification. Advances in Neural Information Processing Systems, 35:15862–15874, 2022

  43. [43]

    Adaptive graph enhancement for imbalanced multi-relation graph learning

    Yiyue Qian, Tianyi Ma, Chuxu Zhang, and Yanfang Ye. Adaptive graph enhancement for imbalanced multi-relation graph learning. In WSDM, 2025

  44. [44]

    Recipe for a general, powerful, scalable graph transformer

    Ladislav Rampášek, Mikhail Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, and Dominique Beaini. Recipe for a general, powerful, scalable graph transformer. In Advances in Neural Information Processing Systems (NeurIPS), 2022

  45. [45]

    On the foundations of combinatorial theory I: Theory of Möbius functions

    Gian-Carlo Rota. On the foundations of combinatorial theory I: Theory of Möbius functions. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 1964

  46. [46]

    Deep learning improves prediction of drug–drug and drug–food interactions

    Jae Yong Ryu, Hyun Uk Kim, and Sang Yup Lee. Deep learning improves prediction of drug–drug and drug–food interactions. Proceedings of the National Academy of Sciences, 2018

  47. [47]

    HyGNN: Drug-drug interaction prediction via hypergraph neural network

    Khaled Mohammed Saifuddin, Briana Bumgardner, Farhan Tanvir, and Esra Akbas. HyGNN: Drug-drug interaction prediction via hypergraph neural network. In 2023 IEEE 39th International Conference on Data Engineering (ICDE), 2023

  48. [48]

    Li Sun, Ming Zhang, Wenxin Jin, Zhongtian Sun, Zhenhao Huang, Hao Peng, Sen Su, and Philip S. Yu. Heterophily-agnostic hypergraph neural networks with Riemannian local exchanger. In Proceedings of the ACM Web Conference (WWW), 2026

  49. [49]

    Hypergraph-MLP: Learning on hypergraphs without message passing

    Bohan Tang, Siheng Chen, and Xiaowen Dong. Hypergraph-MLP: Learning on hypergraphs without message passing. In ICASSP 2024 – 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

  50. [50]

    Data-driven prediction of drug effects and interactions

    Nicholas P. Tatonetti, Patrick P. Ye, Roxana Daneshjou, and Russ B. Altman. Data-driven prediction of drug effects and interactions. Science Translational Medicine, 2012

  51. [51]

    Hypergraph neural networks through the lens of message passing: A common perspective to homophily and architecture design

    Lev Telyatnikov, Maria Sofia Bucarelli, Guillermo Bernardez, Olga Zaghen, Simone Scardapane, and Pietro Liò. Hypergraph neural networks through the lens of message passing: A common perspective to homophily and architecture design. Transactions on Machine Learning Research (TMLR), 2025

  52. [52]

    Quality evaluation of the Japanese Adverse Drug Event Report database (JADER)

    Masami Tsuchiya, Taku Obara, Takamasa Sakai, Kaori Nomura, Chizuko Takamura, and Nariyasu Mano. Quality evaluation of the Japanese Adverse Drug Event Report database (JADER). Pharmacoepidemiology and Drug Safety, 2020

  53. [53]

    Graph attention networks

    Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In International Conference on Learning Representations (ICLR), 2018

  54. [54]

    Equivariant hypergraph diffusion neural operators

    Peihao Wang, Shenghao Yang, Yunyu Liu, Zhangyang Wang, and Pan Li. Equivariant hypergraph diffusion neural operators. In International Conference on Learning Representations (ICLR), 2023

  55. [55]

    From hypergraph energy functions to hypergraph neural networks

    Yuxin Wang, Quan Gan, Xipeng Qiu, Xuanjing Huang, and David Wipf. From hypergraph energy functions to hypergraph neural networks. In Proceedings of the 40th International Conference on Machine Learning (ICML), 2023

  56. [56]

    GFT: Graph foundation model with transferable tree vocabulary

    Zehong Wang, Zheyuan Zhang, Nitesh V. Chawla, Chuxu Zhang, and Yanfang Ye. GFT: Graph foundation model with transferable tree vocabulary. In Advances in Neural Information Processing Systems (NeurIPS), 2024

  57. [57]

    Subgraph pooling: Tackling negative transfer on graphs

    Zehong Wang, Zheyuan Zhang, Chuxu Zhang, and Yanfang Ye. Subgraph pooling: Tackling negative transfer on graphs. In International Joint Conference on Artificial Intelligence (IJCAI), 2024

  58. [58]

    Beyond message passing: Neural graph pattern machine

    Zehong Wang, Zheyuan Zhang, Tianyi Ma, Nitesh V. Chawla, Chuxu Zhang, and Yanfang Ye. Beyond message passing: Neural graph pattern machine. In International Conference on Machine Learning (ICML), 2025

  59. [59]

    Generative graph pattern machine

    Zehong Wang, Zheyuan Zhang, Tianyi Ma, Chuxu Zhang, and Yanfang Ye. Generative graph pattern machine. In Advances in Neural Information Processing Systems (NeurIPS), 2025

  60. [60]

    HODDI: A dataset of high-order drug-drug interactions for computational pharmacovigilance

    Zhaoying Wang, Yingdan Shi, Xiang Liu, Can Chen, Jun Wen, and Ren Wang. HODDI: A dataset of high-order drug-drug interactions for computational pharmacovigilance. arXiv preprint arXiv:2502.06274, 2025

  61. [61]

    Jun Xia, Chengshuai Zhao, Bozhen Hu, Zhangyang Gao, Cheng Tan, Yue Liu, Siyuan Li, and Stan Z. Li. Mole-BERT: Rethinking pre-training graph neural networks for molecules. In International Conference on Learning Representations (ICLR), 2023

  62. [62]

    K-hop hypergraph neural network: A comprehensive aggregation approach

    Linhuang Xie, Shihao Gao, Jie Liu, Ming Yin, and Taisong Jin. K-hop hypergraph neural network: A comprehensive aggregation approach. In Proceedings of the AAAI Conference on Artificial Intelligence, 2025

  63. [63]

    How powerful are graph neural networks?

    Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In International Conference on Learning Representations (ICLR), 2019

  64. [64]

    HyperGCN: A new method for training graph convolutional networks on hypergraphs

    Naganand Yadati, Madhav Nimishakavi, Prateek Yadav, Vikram Nitin, Anand Louis, and Partha Talukdar. HyperGCN: A new method for training graph convolutional networks on hypergraphs. In Advances in Neural Information Processing Systems (NeurIPS), 2019

  65. [65]

    Semi-supervised hypergraph node classification on hypergraph line expansion

    Chaoqi Yang, Ruijie Wang, Shuochao Yao, and Tarek F. Abdelzaher. Semi-supervised hypergraph node classification on hypergraph line expansion. In Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM), 2022

  66. [66]

    GraphSynergy: a network-inspired deep learning model for anticancer drug combination prediction

    Jiannan Yang, Zhongzhi Xu, William Ka Kei Wu, Qian Chu, and Qingpeng Zhang. GraphSynergy: a network-inspired deep learning model for anticancer drug combination prediction. Journal of the American Medical Informatics Association, 2021

  67. [67]

    Do transformers really perform badly for graph representation?

    Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, and Tie-Yan Liu. Do transformers really perform badly for graph representation? In Advances in Neural Information Processing Systems (NeurIPS), 2021

  68. [68]

    Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabás Póczos, Ruslan Salakhutdinov, and Alexander J. Smola. Deep sets. In Advances in Neural Information Processing Systems (NeurIPS), 2017

  69. [69]

    Multi-view self-supervised heterogeneous graph embedding

    Jianan Zhao, Qianlong Wen, Shiyu Sun, Yanfang Ye, and Chuxu Zhang. Multi-view self-supervised heterogeneous graph embedding. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 319–334. Springer, 2021

  70. [70]

    Self-supervised graph structure refinement for graph neural networks

    Jianan Zhao, Qianlong Wen, Mingxuan Ju, Chuxu Zhang, and Yanfang Ye. Self-supervised graph structure refinement for graph neural networks. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, pages 159–167, 2023

  71. [71]

    Modeling polypharmacy side effects with graph convolutional networks

    Marinka Zitnik, Monica Agrawal, and Jure Leskovec. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics, 2018

    A Impact Statement

    This work introduces the Hypergraph Pattern Machine, a framework that introduces interaction compositionality as a supervision target for hypergraph learning, distinguishing whether a higher-ord...

  72. [72]

    Lemma 2 (Bipartite-WL collapse on Construction 1). On both $H'_1$ and $H'_2$, bipartite-WL stabilizes at iteration 2 with a single $V'$-side color and a single $E'$-side color

    (Brute-force check over all 720 vertex permutations confirms no isomorphism exists.) • Bipartite-WL stable colorings on $H'_1$ and $H'_2$ are pointwise identical (Lemma 2 below). Lemma 2 (Bipartite-WL collapse on Construction 1). On both $H'_1$ and $H'_2$, bipartite-WL stabilizes at iteration 2 with a single $V'$-side color and a single $E'$-side color. Consequently, ...

  73. [73]

    The out-degree of a neg token equals the number of observed hyperedges containing $c$ that have it as a 2-subset. In $H'_1$ at any $c \in V^*$: by the 2-overlap pair structure, exactly one neg token is a 2-subset of both observed hyperedges (the shared 2-overlap; e.g., $\{1,2\}$ at $c = 1$); the other two neg tokens are subsets of one observed hyperedge each. Out-degree mul...

  74. [74]

    The coefficients $(1, -3, 2)$ are not all zero, hence $h^{\mathrm{ctx}}_c[H'_1] \neq h^{\mathrm{ctx}}_c[H'_2]$, establishing (A3)

    For $d \geq 4$, the LayerNorm-images of three orthonormal unit basis vectors remain linearly independent (their non-zero coordinates lie at distinct indices), so $\{v_{\mathrm{center}}, v_{\mathrm{neg}}, v_{\mathrm{obs}}\}$ are linearly independent. The coefficients $(1, -3, 2)$ are not all zero, hence $h^{\mathrm{ctx}}_c[H'_1] \neq h^{\mathrm{ctx}}_c[H'_2]$, establishing (A3). (A1) holds because softmax / tanh / GELU / linear map...