
arxiv: 2605.16668 · v1 · pith:YGJSIKIHnew · submitted 2026-05-15 · 💻 cs.LG · cs.AI

GraViti: Graph-Level Variational Autoencoders with Relaxed Permutation Invariance

Pith reviewed 2026-05-20 19:17 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords graph variational autoencoders · permutation invariance · molecular generation · graph-level latent space · transformer models · chemical validity · graph reconstruction

The pith

GraViti encodes entire graphs into compact latent vectors to recover domain rules like chemical constraints in molecules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces GraViti, a transformer-based variational autoencoder that compresses each graph into a single latent vector. This creates a dedicated graph-level latent space that enables operations such as smooth interpolation and property-guided generation. The model learns to produce valid output graphs that obey the constraints seen in training data, such as chemical rules for molecules. It shows that enforcing permutation invariance can reduce reconstruction consistency when a canonical node ordering is available. The single-step decoding process yields state-of-the-art accuracy on large datasets while keeping generation lightweight.

Core claim

GraViti is a transformer-based graph-level variational autoencoder that maps entire graphs to compact latent vectors. This design produces a true graph-level latent space that supports smooth interpolation, property-guided search, and other downstream tasks beyond the constraints of node-level embeddings. On molecular benchmarks, GraViti learns to decode valid samples that follow the chemical constraints present in the training data, showing that the model recovers domain rules directly from graph-level representations. We also show that, in domains where a reliable canonical node ordering exists, such as molecules or Bayesian networks, enforcing permutation invariance can prove detrimental for consistent reconstruction.

What carries the argument

Transformer encoder-decoder that generates a single compact latent vector for each input graph and reconstructs the graph without enforcing permutation invariance.
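
As a rough illustration of what "a single compact latent vector" means in a variational autoencoder, the sketch below shows the standard reparameterized sampling step and the KL term against a standard normal prior. This is the generic VAE recipe, not GraViti's actual objective, which this page does not spell out; the function names, the 4-dimensional latent, and the numbers in `mu` and `logvar` are all made up for the example.

```python
import math
import random

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one draw per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

# Hypothetical encoder outputs for ONE graph (illustrative values only).
mu = [0.3, -1.2, 0.0, 0.8]
logvar = [-0.5, 0.1, 0.0, -2.0]

z = reparameterize(mu, logvar, random.Random(0))  # one latent vector per graph
kl = kl_to_standard_normal(mu, logvar)            # regularizer added to the reconstruction loss
```

The point of the sketch is only that the whole graph is summarized by one vector `z`, rather than by one embedding per node.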

If this is right

  • Supports smooth interpolation between different graph structures in the latent space.
  • Allows property-guided search for generating new graphs with desired attributes.
  • Recovers and applies domain rules such as chemical validity directly from the latent representations.
  • Delivers higher reconstruction accuracy on large graph datasets compared to previous approaches.
  • Simplifies graph generation through single-step decoding rather than iterative processes.
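
The interpolation point above can be pictured with a small sketch: walk in a straight line between the latent vectors of two graphs and decode each intermediate point. Linear interpolation is an assumption here (the page does not say which scheme the authors use), and the decoding step is omitted because it needs the trained model; `interpolate`, `z_a`, and `z_b` are hypothetical names and values.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` evenly spaced latent vectors from z_start to z_end, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path

# Made-up latent codes standing in for two encoded graphs.
z_a = [0.0, 1.0, -2.0]
z_b = [4.0, -1.0, 2.0]
path = interpolate(z_a, z_b, steps=5)
# Each vector in `path` would then be passed through the decoder in a single step.
```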

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Relaxing permutation invariance may benefit generative modeling in other domains that have natural orderings, such as certain types of networks.
  • Graph-level latent spaces could lead to more consistent outputs in tasks requiring global structure preservation.
  • The approach might be extended to graphs without canonical orderings by learning an ordering as part of the model.
  • Direct recovery of rules from latent space suggests potential for more interpretable graph generative models.

Load-bearing premise

Enforcing permutation invariance is detrimental to reconstruction consistency when the data has a reliable canonical node ordering.
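
To make the premise concrete, the toy sketch below relabels the nodes of a small graph and compares a permutation-invariant summary with an order-sensitive one. Both summaries are stand-ins chosen for brevity, not components of GraViti: an invariant model cannot tell the two labelings apart, while an order-sensitive model can, which only helps if the labeling is canonical.

```python
def permute_graph(adj, perm):
    """Relabel nodes so that new node i is old node perm[i]."""
    n = len(adj)
    return [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

def degree_multiset(adj):
    """Permutation-invariant summary: the sorted list of node degrees."""
    return sorted(sum(row) for row in adj)

def flatten(adj):
    """Order-sensitive summary: adjacency entries read row by row."""
    return [x for row in adj for x in row]

# Path graph 0 - 1 - 2, then the same graph with nodes 0 and 1 swapped.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
adj_perm = permute_graph(adj, [1, 0, 2])

same_invariant = degree_multiset(adj) == degree_multiset(adj_perm)  # True
same_ordered = flatten(adj) == flatten(adj_perm)                    # False
```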

What would settle it

A direct comparison in which a permutation-invariant version of the same transformer architecture achieves equal or better reconstruction accuracy on the molecular benchmarks would falsify the benefit of relaxing the invariance.

Figures

Figures reproduced from arXiv: 2605.16668 by Iakovos Evdaimon, Johannes F. Lutzeyer, Konstantinos Divriotis, Michalis Vazirgiannis, Roman Bresson.

Figure 1: The architecture of our model. The data space is on the left, while the latent space is on the ...
Figure 2: Denoising results: each cell reports the fraction of valid molecules (from PubChem16) after ...
Figure 3: LogP maximization from 2 molecules. Starting from the embedding of the top-left molecule, the ...
Figure 4: Interpolation between two pairs of molecules (only steps where the molecule is updated are ...
Figure 5: Comparison of AE and VAE optimization performance on 128 random molecules. The structure ...
Figure 6: Extra logP optimization plots (PubChem16).
Figure 7: Extra logP optimization plots (PubChem32).
Figure 8: Denoising results: each cell reports the fraction of valid molecules (1024 molecules from ...
Figure 9: Denoising results: each cell reports the fraction of valid molecules (1024 molecules from ...
Figure 10: Extra molecule interpolation plots (PC16)
Figure 11: Extra molecule interpolation plots (PC32)
Figure 12: Samples of generated molecules (PubChem16)

Original abstract

We introduce GraViti, a transformer-based graph-level variational autoencoder that maps entire graphs to compact latent vectors. This design produces a true graph-level latent space that supports smooth interpolation, property-guided search, and other downstream tasks beyond the constraints of node-level embeddings. On molecular benchmarks, GraViti learns to decode valid samples that follow the chemical constraints present in the training data, showing that the model recovers domain rules directly from graph-level representations. We also show that, in domains where a reliable canonical node ordering exists, such as molecules or Bayesian networks, enforcing permutation invariance can prove detrimental for consistent reconstruction. GraViti achieves state-of-the-art reconstruction accuracy on large datasets, and provides solid generative performance. Its single-step decoding offers a lightweight alternative to more complex generation pipelines while maintaining practical sample quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces GraViti, a transformer-based graph-level variational autoencoder that encodes entire graphs into compact latent vectors to support interpolation, property-guided generation, and other downstream tasks. On molecular benchmarks it claims to recover valid samples obeying chemical constraints from the training data, achieve state-of-the-art reconstruction accuracy, and deliver solid generative performance via single-step decoding. A central assertion is that, in domains possessing reliable canonical node orderings (molecules, Bayesian networks), enforcing permutation invariance harms consistent reconstruction; the model therefore relaxes this invariance.

Significance. If the empirical results hold after proper controls, the work offers a practical graph-level latent space and a lightweight single-step decoder that can exploit domain canonicalizations. This highlights a useful trade-off between strict invariance and reconstruction fidelity in structured data, and the recovery of domain rules directly from graph-level latents is a notable strength.

major comments (1)
  1. [Abstract and §4] Abstract and §4 (Experiments): the central claim that 'enforcing permutation invariance can prove detrimental for consistent reconstruction' in domains with canonical orderings is load-bearing, yet the reported comparisons do not isolate this factor. No control is described that trains an otherwise identical invariant model on the same canonically ordered node sequences; therefore performance gains cannot be unambiguously attributed to relaxation of invariance rather than to the canonical ordering itself or to the transformer architecture.
minor comments (2)
  1. [Abstract] Abstract: quantitative claims of 'state-of-the-art reconstruction accuracy' and 'solid generative performance' are stated without accompanying metrics, baselines, error bars, or dataset sizes, which should be summarized here for immediate assessment.
  2. [§3] Notation: the precise definition of the relaxed permutation-invariance mechanism (e.g., how the attention mask or positional encoding is modified) should be given explicitly in §3 before the experimental comparisons.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and for identifying the need to more rigorously isolate the effect of relaxing permutation invariance. We address the major comment below and commit to revisions that strengthen the empirical support for our central claim.

Point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (Experiments): the central claim that 'enforcing permutation invariance can prove detrimental for consistent reconstruction' in domains with canonical orderings is load-bearing, yet the reported comparisons do not isolate this factor. No control is described that trains an otherwise identical invariant model on the same canonically ordered node sequences; therefore performance gains cannot be unambiguously attributed to relaxation of invariance rather than to the canonical ordering itself or to the transformer architecture.

    Authors: We agree that the current experimental design does not fully isolate the contribution of relaxed permutation invariance from the use of canonical orderings or the transformer architecture. Our reported results compare GraViti against prior graph VAE methods (some invariant, some not), but we lack a direct ablation using an otherwise identical invariant encoder-decoder pair trained on the same canonically ordered sequences. In the revised manuscript we will add this control experiment: we will train an invariant variant of our model (using an invariant aggregation such as sum or max pooling over node embeddings while keeping the transformer backbone and canonical ordering) and report reconstruction accuracy, validity, and other metrics on the same molecular datasets. This addition will allow readers to attribute performance differences more unambiguously to the relaxation of invariance. revision: yes
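
The invariant aggregation mentioned in this response can be sketched as follows. Sum and max pooling over node embeddings give the same graph-level vector regardless of node order, whereas a readout that keeps node positions does not. The concatenation readout is only a stand-in for "order-sensitive"; how GraViti actually forms its latent vector is not described on this page, and the embeddings below are invented.

```python
def sum_pool(node_embeddings):
    """Permutation-invariant readout: element-wise sum over nodes."""
    return [sum(col) for col in zip(*node_embeddings)]

def max_pool(node_embeddings):
    """Permutation-invariant readout: element-wise maximum over nodes."""
    return [max(col) for col in zip(*node_embeddings)]

def concat_readout(node_embeddings):
    """Order-sensitive readout (illustrative): concatenate embeddings in node order."""
    return [x for emb in node_embeddings for x in emb]

# Three hypothetical 2-dimensional node embeddings, then the same nodes reordered.
nodes = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
shuffled = [nodes[2], nodes[0], nodes[1]]

pooled_sum = sum_pool(nodes)
pooled_max = max_pool(nodes)
```

In the proposed control, swapping the readout while keeping the backbone and the canonical ordering fixed is what would isolate the effect of invariance.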

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical benchmarks rather than self-referential definitions or fitted inputs

Full rationale

The paper introduces an architectural modification (relaxed permutation invariance in a transformer-based graph VAE) and reports reconstruction and generation results on molecular and other graph datasets. No derivation chain is presented that reduces a claimed prediction or first-principles result to its own inputs by construction. The assertion that permutation invariance can be detrimental in canonically ordered domains is framed as an empirical observation from model comparisons, not as a mathematical identity or a parameter fit renamed as a prediction. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results appear in the provided abstract or described structure. The work is therefore self-contained against external benchmarks and replication.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no explicit free parameters, background axioms, or newly postulated entities; the full manuscript would be required to enumerate any latent-dimension choices, architectural hyperparameters, or modeling assumptions.


