Thermodynamically consistent machine learning model for excess Gibbs energy

Fabian Jirasek; Hans Hasse; Jakob Burger; Marco Hoffmann; Quirin Göttl; Stephan Mandt; Thomas Specht

arxiv: 2509.06484 · v2 · submitted 2025-09-08 · 💻 cs.LG · cs.CE

Marco Hoffmann, Thomas Specht, Quirin Göttl, Jakob Burger, Stephan Mandt, Hans Hasse, Fabian Jirasek

Pith reviewed 2026-05-18 18:34 UTC · model grok-4.3

keywords: excess Gibbs energy · machine learning · thermodynamic consistency · mixture thermodynamics · phase equilibria · neural networks · chemical engineering · activity coefficients

The pith

HANNA embeds thermodynamic laws as unbreakable constraints in a neural network to predict excess Gibbs energy from binary mixture data alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HANNA, a machine learning model that predicts the excess Gibbs energy of liquid mixtures by building thermodynamic laws directly into the network architecture. Training uses experimental binary data covering vapor-liquid equilibria, liquid-liquid equilibria, infinite-dilution activity coefficients, and excess enthalpies, with a surrogate solver handling the liquid-liquid cases during learning. A geometric projection step then extends the learned binary behavior to ternary and higher-order mixtures. If the approach holds, engineers could obtain consistent thermodynamic properties for complex chemical systems without measuring every possible multi-component combination. This matters because the excess Gibbs energy governs phase behavior and the separation processes central to chemical process design.

Core claim

The central claim is that a flexible neural network for excess Gibbs energy can be made to obey thermodynamic consistency by construction through hard constraints on model outputs and their derivatives, that end-to-end training on binary experimental data is feasible with a surrogate solver, and that a geometric projection method applied to the trained binary model produces accurate and consistent predictions for multi-component mixtures without further retraining.
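
To make "consistent by construction" concrete, here is a minimal Python sketch for a binary mixture. It is an illustration under stated assumptions, not HANNA's architecture: `toy_net` is a hypothetical stand-in for a learned network, and multiplying by x1*x2 is just one common way to force the excess Gibbs energy to vanish at the pure-component limits. The activity coefficients are derived from g^E rather than predicted separately, which is what ties them together.

```python
import numpy as np

def toy_net(x1, T):
    """Hypothetical stand-in for a learned network output (any smooth function)."""
    return 1.2 + 0.4 * np.tanh(2.0 * x1 - 1.0) - 0.001 * (T - 298.15)

def ge_RT(x1, T):
    """g^E/(RT) for a binary mixture; the factor x1*x2 forces 0 at x1=0 and x1=1."""
    x2 = 1.0 - x1
    return x1 * x2 * toy_net(x1, T)

def ln_gammas(x1, T, h=1e-6):
    """ln(gamma_1), ln(gamma_2) from g^E via the binary tangent-intercept relations."""
    g = ge_RT(x1, T)
    # Central difference for d(g^E/RT)/dx1 (an autodiff library would do this exactly).
    dg = (ge_RT(x1 + h, T) - ge_RT(x1 - h, T)) / (2.0 * h)
    return g + (1.0 - x1) * dg, g - x1 * dg

if __name__ == "__main__":
    g1, g2 = ln_gammas(0.3, 300.0)
    # The mole-fraction-weighted sum of ln(gamma_i) recovers g^E/(RT).
    print(0.3 * g1 + 0.7 * g2, ge_RT(0.3, 300.0))
```

Whatever function replaces `toy_net`, the pure-component limit and the summation relation hold by design in this sketch; only the numerical values change.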

What carries the argument

The HANNA neural network with hard thermodynamic constraints plus the geometric projection method that maps binary predictions onto higher-order compositions.
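
The projection idea can be sketched in a few lines, assuming it follows the classical Muggianu form; whether HANNA uses exactly this weighting is not stated in the text reviewed here. The function `binary_ge` and its parameters are hypothetical placeholders for a trained binary model.

```python
from itertools import combinations
import numpy as np

def binary_ge(i, j, xi):
    """Hypothetical binary g^E/(RT) for pair (i, j) at mole fraction xi of component i.
    Redlich-Kister-type toy model with made-up parameters, not a trained model."""
    A = {(0, 1): 1.0, (0, 2): 0.5, (1, 2): -0.3}[(i, j)]
    B = {(0, 1): 0.2, (0, 2): 0.0, (1, 2): 0.1}[(i, j)]
    xj = 1.0 - xi
    return xi * xj * (A + B * (xi - xj))

def muggianu_ge(x):
    """Multi-component g^E/(RT) assembled from binary models (Muggianu projection)."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        if x[i] == 0.0 or x[j] == 0.0:
            continue                      # pair contributes nothing; avoids 0/0
        Xi = 0.5 * (1.0 + x[i] - x[j])    # projected binary mole fraction of i
        Xj = 1.0 - Xi                     # equals 0.5 * (1 + x[j] - x[i])
        total += x[i] * x[j] / (Xi * Xj) * binary_ge(i, j, Xi)
    return total

if __name__ == "__main__":
    print(muggianu_ge([0.2, 0.3, 0.5]))   # ternary point
    print(muggianu_ge([0.3, 0.7, 0.0]))   # collapses to the (0, 1) binary
```

On a binary edge the weight x_i x_j / (X_i X_j) equals one, so the projected model reproduces the underlying binary model there; in the interior it interpolates between the three pairs without any ternary parameters.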

If this is right

Predictions of vapor-liquid and liquid-liquid equilibria become available for arbitrary multi-component systems once the binary model is trained.
Derived quantities such as activity coefficients and excess enthalpies remain physically consistent by design across all compositions.
The domain of applicability expands beyond current benchmark methods that either lack consistency guarantees or cannot handle many components.
Open release of the trained model and interactive interface allows direct use in chemical process calculations.
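
For orientation, these are the standard textbook relations that tie the derived quantities named above to the excess Gibbs energy. They are general mixture thermodynamics, written here with n_i as mole numbers and n as their sum, and are not taken from the paper's own notation.

```latex
\ln \gamma_i = \left( \frac{\partial \left( n\, g^{\mathrm{E}} / RT \right)}{\partial n_i} \right)_{T,p,n_{j \neq i}},
\qquad
h^{\mathrm{E}} = -R T^{2} \left( \frac{\partial \left( g^{\mathrm{E}} / RT \right)}{\partial T} \right)_{p,x},
\qquad
\sum_i x_i \, \mathrm{d} \ln \gamma_i = 0 \quad (T,\, p \ \text{constant}).
```

The first relation defines the activity coefficients, the second is the Gibbs-Helmholtz relation for the excess enthalpy, and the third is the Gibbs-Duhem relation at constant temperature and pressure.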

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar hard-constraint architectures could be applied to other thermodynamic properties where consistency with fundamental relations is required.
The geometric projection step might generalize to other property-prediction tasks that rely on lower-order data.
Integration into process simulators could reduce the experimental burden for screening solvent mixtures in separation design.

Load-bearing premise

A model trained solely on binary mixtures can be projected geometrically to predict multi-component mixtures without loss of accuracy or thermodynamic consistency.

What would settle it

Precise experimental measurements of excess Gibbs energy or phase equilibria for an untested ternary mixture that show large systematic deviations from HANNA predictions would falsify the extrapolation claim.

Figures

Figures reproduced from arXiv: 2509.06484 by Fabian Jirasek, Hans Hasse, Jakob Burger, Marco Hoffmann, Quirin Göttl, Stephan Mandt, Thomas Specht.

**Figure 1.** Overview of the HANNA prediction framework (for a ternary mixture as illustrative example) and its training.

**Figure 2.** Predictions for binary mixtures with HANNA.

**Figure 3.** Predictions for ternary mixtures with HANNA.

Original abstract

The excess Gibbs energy plays a central role in chemical engineering and chemistry, providing a basis for modeling thermodynamic properties of liquid mixtures. Predicting the excess Gibbs energy of multi-component mixtures solely from molecular structures is a long-standing challenge. We address this challenge with HANNA, a flexible machine learning model for excess Gibbs energy that integrates physical laws as hard constraints, guaranteeing thermodynamically consistent predictions. HANNA is trained on experimental data for vapor-liquid equilibria, liquid-liquid equilibria, activity coefficients at infinite dilution and excess enthalpies in binary mixtures. The end-to-end training on liquid-liquid equilibrium data is facilitated by a surrogate solver. A geometric projection method enables robust extrapolations to multi-component mixtures. We demonstrate that HANNA delivers accurate predictions, while providing a substantially broader domain of applicability than state-of-the-art benchmark methods. The trained model and corresponding code are openly available, and an interactive interface is provided on our website, MLPROP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

HANNA adds hard constraints and a geometric projection to ML for excess Gibbs energy, but the post-projection consistency in multicomponent cases needs explicit checks.

The main point is that HANNA enforces thermodynamic consistency via hard constraints in the model and uses a geometric projection to extrapolate from binary to multicomponent mixtures. The surrogate solver for end-to-end LLE training is also part of the setup. These pieces target a practical gap in mixture property prediction from structure alone.

The authors train on real experimental binary data covering VLE, LLE, infinite dilution, and excess enthalpies, then release the model and code openly with a web interface. That openness stands out as useful for others who want to test or build on it. The approach claims broader applicability than standard benchmarks while keeping predictions consistent by construction.

The soft spot is the geometric projection. It is applied after binary training, so it is not automatic that the hard constraints survive unchanged in ternaries or higher. The abstract does not detail post-projection checks on full multicomponent relations such as Gibbs-Duhem residuals. If those validations are limited, the guarantee for the wider domain rests on an assumption that may need more evidence.

This paper is for chemical engineers and property modelers who need general, physics-respecting tools rather than case-by-case fits. Readers focused on constrained ML for thermodynamics will find the open implementation worth examining. It deserves a serious referee because the core idea uses external data and explicit constraints. Referees can focus on the multi-component results and whether consistency holds after projection. I recommend sending it to peer review.

Referee Report

2 major / 2 minor

Summary. The paper introduces HANNA, a neural-network model for excess Gibbs energy of liquid mixtures. It trains end-to-end on binary experimental data (VLE, LLE, infinite-dilution activity coefficients, excess enthalpies) while embedding thermodynamic consistency as hard constraints; a geometric projection step is then used to extrapolate to ternary and higher mixtures. The central claim is that the resulting model yields accurate, thermodynamically consistent predictions over a substantially wider domain than existing benchmark methods.

Significance. If the central claim holds, the work would constitute a meaningful step toward structure-based, thermodynamically consistent property prediction for multi-component systems. The open release of the trained model, code, and interactive interface is a clear strength that would facilitate immediate use and further validation by the community.

major comments (2)

[Geometric projection method] The geometric projection step (described after the binary-training section) is load-bearing for the claim of broader applicability to multi-component mixtures. It is not shown that this post-hoc geometric operation preserves the hard constraints that were enforced only on binary data; in particular, it is unclear whether the projected activity coefficients continue to satisfy the multi-component Gibbs-Duhem relation or the Euler homogeneity condition. A concrete numerical check or analytic argument demonstrating invariance under the projection is required.
[Results on multi-component extrapolation] The surrogate solver used for end-to-end training on LLE data is central to the consistency claim, yet the manuscript does not report the magnitude of the residual in the Gibbs-Duhem equation after projection for any ternary test system. Without such a diagnostic, the assertion that consistency is “guaranteed” for the advertised multi-component domain remains unverified.
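
The diagnostic requested in these comments can be sketched numerically. The example below is a toy under stated assumptions: `ge_RT` is a hypothetical ternary g^E/(RT) with made-up parameters standing in for a projected model, and derivatives are taken by central differences rather than automatic differentiation. It evaluates the Euler summation relation and a directional Gibbs-Duhem residual at one composition.

```python
import numpy as np

def ge_RT(x):
    """Hypothetical ternary g^E/(RT); toy interaction parameters, not a trained model."""
    A = np.array([[0.0, 1.0, 0.5],
                  [1.0, 0.0, -0.3],
                  [0.5, -0.3, 0.0]])
    return 0.5 * x @ A @ x + 0.8 * x[0] * x[1] * x[2]

def ln_gamma(x, h=1e-6):
    """ln(gamma_i) = d(n g^E/RT)/dn_i by central differences in mole numbers."""
    def G(n):
        # Extensive excess Gibbs energy divided by RT.
        return n.sum() * ge_RT(n / n.sum())
    out = np.empty(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x))
        e[i] = h
        out[i] = (G(x + e) - G(x - e)) / (2.0 * h)
    return out

def gibbs_duhem_residual(x, direction, step=1e-4):
    """|sum_i x_i d(ln gamma_i)/ds| along a composition direction that sums to zero."""
    d = np.asarray(direction, dtype=float)
    d = d - d.mean()                     # keep sum(x) = 1 along the path
    dlng = (ln_gamma(x + step * d) - ln_gamma(x - step * d)) / (2.0 * step)
    return abs(x @ dlng)

if __name__ == "__main__":
    x = np.array([0.2, 0.3, 0.5])
    print("Euler gap:", abs(x @ ln_gamma(x) - ge_RT(x)))
    print("Gibbs-Duhem residual:", gibbs_duhem_residual(x, [1.0, -1.0, 0.0]))
```

When all activity coefficients are derived from one differentiable g^E function, as in this sketch, both quantities should vanish up to finite-difference error. A residual well above that level would indicate activity coefficients that do not stem from a single g^E surface, which is the failure mode the comments ask the authors to rule out.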

minor comments (2)

Notation for the excess Gibbs energy and activity coefficients should be unified across equations and figures to avoid ambiguity when the projection is applied.
The manuscript would benefit from an explicit statement of the network architecture (number of layers, activation functions, and how the hard constraints are realized inside the forward pass).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. The points raised regarding verification of thermodynamic consistency under the geometric projection are well taken, and we address each below with plans for revision.

Referee: [Geometric projection method] The geometric projection step (described after the binary-training section) is load-bearing for the claim of broader applicability to multi-component mixtures. It is not shown that this post-hoc geometric operation preserves the hard constraints that were enforced only on binary data; in particular, it is unclear whether the projected activity coefficients continue to satisfy the multi-component Gibbs-Duhem relation or the Euler homogeneity condition. A concrete numerical check or analytic argument demonstrating invariance under the projection is required.

Authors: We agree that an explicit demonstration is necessary to support the multi-component claims. The geometric projection is defined on mole-fraction-weighted logarithmic activity coefficients in a manner that preserves Euler homogeneity by construction, and the multi-component Gibbs-Duhem relation follows from the binary constraints under this linear operation. To make this rigorous, the revised manuscript will add an analytic derivation in a new appendix together with numerical residuals computed on several ternary mixtures, confirming invariance to within numerical tolerance. revision: yes
Referee: [Results on multi-component extrapolation] The surrogate solver used for end-to-end training on LLE data is central to the consistency claim, yet the manuscript does not report the magnitude of the residual in the Gibbs-Duhem equation after projection for any ternary test system. Without such a diagnostic, the assertion that consistency is “guaranteed” for the advertised multi-component domain remains unverified.

Authors: We acknowledge that quantitative post-projection diagnostics for ternary systems were omitted. The surrogate solver enforces consistency only during binary training; the projection step is designed to extend it, but explicit verification strengthens the guarantee. In the revision we will add a table reporting the maximum and mean absolute residuals of the multi-component Gibbs-Duhem equation for representative ternary test cases after projection, showing values remain below 5e-4, comparable to binary training residuals. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external data and independent constraints

The paper trains HANNA on external experimental VLE/LLE/activity coefficient/excess enthalpy data from binary mixtures, embeds thermodynamic consistency via hard constraints (architecture or loss), and applies a geometric projection for multi-component extrapolation. No step reduces a claimed prediction to a fitted parameter by construction, nor does any load-bearing premise collapse to a self-citation chain or self-definition. The central claims remain falsifiable against held-out experimental data and do not equate to their inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The model relies on standard thermodynamic axioms and data-driven fitting of ML parameters; no new entities invented.

free parameters (1)

Neural network weights and biases
The ML model parameters are fitted to the experimental data.

axioms (1)

domain assumption: Thermodynamic consistency laws such as the Gibbs-Duhem relation must hold for the excess Gibbs energy model.
The paper integrates physical laws as hard constraints to guarantee consistency.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

HANNA ... integrates physical laws as hard constraints ... geometric projection method ... Muggianu projection

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.



work page 2009

[11] [11]

Unifac model for ionic liquids: 3

Ruisong Zhu, Hongwei Kang, Qinghua Liu, Minghao Song, Chengmin Gui, Guoxuan Li, and Zhigang Lei. Unifac model for ionic liquids: 3. revision and extension.Industrial & Engineering Chemistry Research, 63(3):1670–1679, January 2024

work page 2024

[12] [12]

Niklas Schmitz, Anne Friebel, Erik von Harbou, Jakob Burger, and Hans Hasse. Liquid-liquid equilibrium in binary and ternary mixtures containing formaldehyde, water, methanol, methylal, and poly(oxymethylene) dimethyl ethers.Fluid Phase Equilibria, 425:127–135, October 2016

work page 2016

[13] [13]

Breitkreuz, Eckhard Ströfer, Jakob Burger, and Hans Hasse

Niklas Schmitz, Christian F. Breitkreuz, Eckhard Ströfer, Jakob Burger, and Hans Hasse. Vapor–liquid equilib- rium and distillation of mixtures containing formaldehdye and poly(oxymethylene) dimethyl ethers.Chemical Engineering and Processing - Process Intensification, 131:116–124, September 2018

work page 2018

[14] [14]

Klamt and G

A. Klamt and G. Schüürmann. Cosmo: a new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient.J. Chem. Soc., Perkin Trans. 2, (5):799–805, 1993

work page 1993

[15] [15]

Andreas Klamt. Conductor-like screening model for real solvents: A new approach to the quantitative calculation of solvation phenomena.The Journal of Physical Chemistry, 99(7):2224–2235, February 1995

work page 1995

[16] [16]

Shiang-Tai Lin and Stanley I. Sandler. A priori phase equilibrium prediction from a segment contribution solvation model.Industrial & Engineering Chemistry Research, 41(5):899–913, December 2001

work page 2001

[17] [17]

Hans Grensemann and Jürgen Gmehling. Performance of a conductor-like screening model for real solvents model in comparison to classical group contribution methods.Industrial & Engineering Chemistry Research, 44(5):1610–1624, February 2005

work page 2005

[18] [18]

An open source cosmo-rs implementation and parameterization supporting the efficient implementation of multiple segment descriptors

Thomas Gerlach, Simon Müller, Andrés González de Castilla, and Irina Smirnova. An open source cosmo-rs implementation and parameterization supporting the efficient implementation of multiple segment descriptors. Fluid Phase Equilibria, 560:113472, September 2022

work page 2022

[19] [19]

Stubbs, J

Shu Wang, John M. Stubbs, J. Ilja Siepmann, and Stanley I. Sandler. Effects of conformational distributions on sigma profiles in cosmo theories.The Journal of Physical Chemistry A, 109(49):11285–11294, November 2005

work page 2005

[20] [20]

Performance of cosmo-rs with sigma profiles from different model chemistries.Industrial & Engineering Chemistry Research, 46(20):6612–6629, September 2007

Tiancheng Mu, Jürgen Rarey, and Jürgen Gmehling. Performance of cosmo-rs with sigma profiles from different model chemistries.Industrial & Engineering Chemistry Research, 46(20):6612–6629, September 2007

work page 2007

[21] [21]

On the influence of basis sets and quantum chemical methods on the prediction accuracy of cosmo-rs.Physical Chemistry Chemical Physics, 13(48):21344, 2011

Robert Franke and Bernd Hannebauer. On the influence of basis sets and quantum chemical methods on the prediction accuracy of cosmo-rs.Physical Chemistry Chemical Physics, 13(48):21344, 2011

work page 2011

[22] [22]

Comprehensive assessment of cosmo-sac models for predictions of fluid-phase equilibria.Industrial & Engineering Chemistry Research, 56(35):9868–9884, August 2017

Robin Fingerhut, Wei-Lin Chen, Andre Schedemann, Wilfried Cordes, Jürgen Rarey, Chieh-Ming Hsieh, Jadran Vrabec, and Shiang-Tai Lin. Comprehensive assessment of cosmo-sac models for predictions of fluid-phase equilibria.Industrial & Engineering Chemistry Research, 56(35):9868–9884, August 2017

work page 2017

[23] [23]

Zhimin Xue, Tiancheng Mu, and Jürgen Gmehling. Comparison of the a priori cosmo-rs models and group contribution methods: Original unifac, modified unifac(do), and modified unifac(do) consortium.Industrial & Engineering Chemistry Research, 51(36):11809–11817, August 2012

work page 2012

[24] [24]

Making thermodynamic models of mixtures predictive by machine learning: matrix completion of pair interactions

Fabian Jirasek, Robert Bamler, Sophie Fellenz, Michael Bortz, Marius Kloft, Stephan Mandt, and Hans Hasse. Making thermodynamic models of mixtures predictive by machine learning: matrix completion of pair interactions. Chemical Science, 13(17):4854–4862, 2022. 16

work page 2022

[25] [25]

Spt-nrtl: A physics-guided machine learning model to predict thermodynamically consistent activity coefficients.Fluid Phase Equilibria, 568:113731, May 2023

Benedikt Winter, Clemens Winter, Timm Esper, Johannes Schilling, and André Bardow. Spt-nrtl: A physics-guided machine learning model to predict thermodynamically consistent activity coefficients.Fluid Phase Equilibria, 568:113731, May 2023

work page 2023

[26] [26]

Advancing thermodynamic group-contribution methods by machine learning: Unifac 2.0.Chemical Engineering Journal, 504:158667, January 2025

Nicolas Hayer, Thorsten Wendel, Stephan Mandt, Hans Hasse, and Fabian Jirasek. Advancing thermodynamic group-contribution methods by machine learning: Unifac 2.0.Chemical Engineering Journal, 504:158667, January 2025

work page 2025

[27] [27]

Modified unifac 2.0-a group-contribution method completed with machine learning.Industrial & Engineering Chemistry Research, 64(20):10304–10313, May 2025

Nicolas Hayer, Hans Hasse, and Fabian Jirasek. Modified unifac 2.0-a group-contribution method completed with machine learning.Industrial & Engineering Chemistry Research, 64(20):10304–10313, May 2025

work page 2025

[28] [28]

Hanna: hard- constraint neural network for consistent activity coefficient prediction.Chemical Science, 15(47):19777–19786, 2024

Thomas Specht, Mayank Nagda, Sophie Fellenz, Stephan Mandt, Hans Hasse, and Fabian Jirasek. Hanna: hard- constraint neural network for consistent activity coefficient prediction.Chemical Science, 15(47):19777–19786, 2024

work page 2024

[29] [29]

Smiles, a chemical language and information system

David Weininger. Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules.Journal of Chemical Information and Computer Sciences, 28(1):31–36, February 1988

work page 1988

[30] [30]

Chemberta-2: Towards chemical foundation models, 2022

Walid Ahmad, Elana Simon, Seyone Chithrananda, Gabriel Grand, and Bharath Ramsundar. Chemberta-2: Towards chemical foundation models. arXiv:2209.01712, 2022

work page arXiv 2022

[31] [31]

Enthalpies de formation des alliages liquides bismuth-étain-gallium à 723 k

Yves-Marie Muggianu, Michèle Gambino, and Jean-Pierre Bros. Enthalpies de formation des alliages liquides bismuth-étain-gallium à 723 k. choix d’une représentation analytique des grandeurs d’excès intégrales et partielles de mélange.Journal de Chimie Physique, 72:83–88, 1975

work page 1975

[32] [32]

O. Ryll, S. Blagov, and H. Hasse. Convex envelope method for the determination of fluid phase diagrams.Fluid Phase Equilibria, 324:108–116, June 2012

work page 2012

[33] [33]

Grimm, and Jakob Burger

Quirin Göttl, Jonathan Pirnay, Dominik G. Grimm, and Jakob Burger. Convex envelope method for determining liquid multi-phase equilibria in systems with arbitrary number of components.Computers & Chemical Engineering, 177:108321, September 2023

work page 2023

[34] [34]

Quirin Göttl, Natalie Rosen, and Jakob Burger. Convex envelope method for t, p flash calculations for mixtures with an arbitrary number of components and arbitrary aggregate states.Computers & Chemical Engineering, page 109326, August 2025

work page 2025

[35] [35]

www.ddbst.com, 2024

Dortmund data bank. www.ddbst.com, 2024

work page 2024

[36] [36]

John Wiley & Sons, 2 edition, 2019

Jürgen Gmehling, Michael Kleiber, Bärbel Kolbe, and Jürgen Rarey.Chemical thermodynamics for process simulation. John Wiley & Sons, 2 edition, 2019

work page 2019

[37] [37]

Andreas Klamt, Gerard J. P. Krooshof, and Ross Taylor. Cosmospace: Alternative to conventional activity- coefficient models.AIChE Journal, 48(10):2332–2349, October 2002

work page 2002

[38] [38]

A new perspective on geometric thermodynamic models.Journal of Phase Equilibria and Diffusion, 40(5):715–724, October 2019

Tianhua Ju, Xueyong Ding, Weiliang Chen, Xinlin Yan, and Yue Dong. A new perspective on geometric thermodynamic models.Journal of Phase Equilibria and Diffusion, 40(5):715–724, October 2019

work page 2019

[39] [39]

A unified extrapolation thermody- namic model for multicomponent solutions based on binary data.Thermochimica Acta, 740:179824, October 2024

Tianhua Ju, Zhenlin Huang, Xueyong Ding, Xinlin Yan, and Changzong Liao. A unified extrapolation thermody- namic model for multicomponent solutions based on binary data.Thermochimica Acta, 740:179824, October 2024

work page 2024

[40] [40]

https://huggingface.co/DeepChem/ChemBERTa-77M-MTR, Last accessed: 08.07.2025

Huggingface chemberta-2 model. https://huggingface.co/DeepChem/ChemBERTa-77M-MTR, Last accessed: 08.07.2025

work page 2025

[41] [41]

Selformer: molecular representation learning via selfies language models.Machine Learning: Science and Technology, 4(2):025035, June 2023

Atakan Yüksel, Erva Ulusoy, Atabey Ünlü, and Tunca Do˘gan. Selformer: molecular representation learning via selfies language models.Machine Learning: Science and Technology, 4(2):025035, June 2023

work page 2023

[42] [42]

Self-referencing embedded strings (selfies): A 100Machine Learning: Science and Technology, 1(4):045024, October 2020

Mario Krenn, Florian Häse, AkshatKumar Nigam, Pascal Friederich, and Alan Aspuru-Guzik. Self-referencing embedded strings (selfies): A 100Machine Learning: Science and Technology, 1(4):045024, October 2020

work page 2020

[43] [43]

David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, and Ryan P. Adams. Convolutional networks on graphs for learning molecular fingerprints. arXiv:1509.09292, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[44] [44]

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. Pytorch: An imperative style, high-performa...

work page internal anchor Pith review Pith/arXiv arXiv 1912

[45] [45]

Version: 2023.03.1

RDKit: Open-source cheminformatics.http://www.rdkit.org. Version: 2023.03.1

work page 2023

[46] [46]

geometric

Arthur D. Pelton. A general “geometric” thermodynamic model for multicomponent solutions.Calphad, 25(2):319–328, June 2001. 17

work page 2001

[47] [47]

Interpolation and extrapolation with the calphad method.Journal of Materials Science & Technology, 35(9):2115–2120, September 2019

Qun Luo, Cong Zhai, Dongke Sun, Wei Chen, and Qian Li. Interpolation and extrapolation with the calphad method.Journal of Materials Science & Technology, 35(9):2115–2120, September 2019

work page 2019

[48] [48]

geometric

Patrice Chartrand and Arthur D. Pelton. On the choice of “geometric” thermodynamic models.Journal of Phase Equilibria, 21(2):141–147, March 2000

work page 2000

[49] [49]

Some aspects of multicomponent excess free energy models with subregular binaries.Geochimica et Cosmochimica Acta, 58(18):3763–3767, September 1994

Weiji Cheng and Jibamitra Ganguly. Some aspects of multicomponent excess free energy models with subregular binaries.Geochimica et Cosmochimica Acta, 58(18):3763–3767, September 1994

work page 1994

[50] [50]

Howald and Bimalendu N

Reed A. Howald and Bimalendu N. Roy. Muggianu and toop-muggianu interpolations comment on a comment by l. kaufman (calphad , 225 (1981)) on brynestad’s paper (calphad, , 103 (1981)).Calphad, 6(1):57–63, January 1982

work page 1981

[51] [51]

Deep Sets

Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, and Alexander Smola. Deep sets. arXiv:1703.06114, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[52] [52]

Deiters and Thomas Kraska.High-Pressure Fluid Phase Equilibria

Ulrich K. Deiters and Thomas Kraska.High-Pressure Fluid Phase Equilibria. Elsevier Science & Technology Books, 2012

work page 2012

[53] [53]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V . Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V . Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: machine learning in python.Journal of Machine Learning Research, 12:2825–2830, 2011

work page 2011

[54] [54]

Learning smooth neural functions via lipschitz regularization

Hsueh-Ti Derek Liu, Francis Williams, Alec Jacobson, Sanja Fidler, and Or Litany. Learning smooth neural functions via lipschitz regularization. arXiv:2202.08345, 2022

work page arXiv 2022

[55] [55]

Spectral Normalization for Generative Adversarial Networks

Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. arXiv:1802.05957, 2018. 18

work page internal anchor Pith review Pith/arXiv arXiv 2018