Feature Augmentation of GNNs for ILPs: Local Uniqueness Suffices

Linxin Yang; Qian Chen; Qian Li; Qingjiang Shi; Qingyu Han; Ruoyu Sun

arxiv: 2509.21000 · v2 · submitted 2025-09-25 · 💻 cs.LG · math.OC

Feature Augmentation of GNNs for ILPs: Local Uniqueness Suffices

Qingyu Han , Qian Li , Linxin Yang , Qian Chen , Qingjiang Shi , Ruoyu Sun This is my paper

Pith reviewed 2026-05-18 14:18 UTC · model grok-4.3

classification 💻 cs.LG math.OC

keywords Graph Neural NetworksInteger Linear ProgramsLearning to OptimizeLocal Unique IdentifiersFeature AugmentationExpressivenessGeneralizationd-hop Coloring

0 comments

The pith

For d-layer GNNs on integer linear programs, local d-hop unique IDs achieve the same expressiveness as global IDs but with stronger generalization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that augmenting Graph Neural Networks with identifiers unique only within each node's d-hop neighborhood lets them match the expressive power of globally unique identifiers for integer linear programs. This matters because global identifiers boost expressiveness but create spurious correlations that harm generalization when learning to optimize. The authors prove that d-layer networks with these Local-UIDs, based on d-hop uniqueness coloring, reach equivalent power while improving generalization. They introduce ColorGNN using color-conditioned embeddings and a lighter ColorUID variant. Experiments show substantial and robust gains across ILP benchmarks.

Core claim

Local-UIDs based on d-hop uniqueness coloring enable d-layer GNNs to attain the expressive power of Global-UIDs for ILP solving while offering stronger generalization by mitigating spurious correlations from global identifiers.

What carries the argument

d-hop uniqueness coloring that assigns Local-UIDs unique only in each node's d-hop neighborhood, incorporated through color-conditioned embeddings in ColorGNN.

If this is right

For d-layer networks, Local-UIDs achieve the expressive power of Global-UIDs.
ColorGNN and ColorUID yield substantial and robust gains across ILP benchmarks.
The local scheme improves generalization compared to global identifiers in learning-to-optimize settings.
Local uniqueness provides a parsimonious enhancement without the drawbacks of global UIDs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same local uniqueness principle could apply to other graph-based combinatorial optimization tasks.
Selecting the hop count d might be tuned to the diameter or structure of graphs arising from particular ILP families.
This suggests testing whether local augmentation suffices when scaling to very large ILP instances.

Load-bearing premise

Modeling ILPs as graphs allows d-hop local uniqueness to capture all necessary distinctions without losing critical global structure information needed for the optimization task.

What would settle it

An ILP instance or benchmark where a d-layer ColorGNN with Local-UIDs produces worse solutions or generalizes more poorly than one with Global-UIDs.

read the original abstract

Integer Linear Programs (ILPs) are central to real-world optimizations but notoriously difficult to solve. Learning to Optimize (L2O) has emerged as a promising paradigm, with Graph Neural Networks (GNNs) serving as the standard backbone. However, standard anonymous GNNs are limited in expressiveness for ILPs, and the common enhancement of augmenting nodes with globally unique identifiers (UIDs) typically introduces spurious correlations that severely harm generalization. To address this tradeoff, we propose a parsimonious Local-UID scheme based on d-hop uniqueness coloring, which ensures identifiers are unique only within each node's d-hop neighborhood. Building on this scheme, we introduce ColorGNN, which incorporates color information via color-conditioned embeddings, and ColorUID, a lightweight feature-level variant. We prove that for d-layer networks, Local-UIDs achieve the expressive power of Global-UIDs while offering stronger generalization. Extensive experiments show that our approach yields substantial and robust gains across ILP benchmarks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Local d-hop uniqueness gives d-layer GNNs the same ILP expressiveness as global UIDs but with better generalization.

read the letter

The main thing to know is that this paper shows local uniqueness inside each node's d-hop neighborhood is enough for d-layer GNNs to match the distinguishing power of global UIDs on ILP graphs, while avoiding the generalization penalty that global IDs usually bring. They do this with a d-hop uniqueness coloring scheme, then build two variants: ColorGNN that feeds the colors into conditioned embeddings and ColorUID that adds them as simple features. The proof follows from the fact that message passing only ever aggregates within the d-hop receptive field, so local breaks of symmetry are sufficient. That part lines up with how GNNs actually work. Experiments report clear gains on standard ILP benchmarks, which is the practical payoff. The approach is parsimonious and directly targets the spurious-correlation issue with global UIDs. One soft spot is that the paper needs to show the coloring step stays cheap on larger instances; if it scales linearly it is fine, but any hidden quadratic cost would limit use. Another is whether the bipartite variable-constraint graphs used for ILPs ever require distinctions outside the d-hop ball for the optimization task itself. The modeling choice looks reasonable, but a quick ablation on graph diameter would settle it. Readers working on learning-to-optimize with GNNs will find this useful because it gives a concrete, theoretically justified tweak rather than another global trick. It is solid enough on both the proof sketch and the empirical side to deserve a serious referee, even if the experiments need tighter statistical reporting in revision.

Referee Report

0 major / 3 minor

Summary. The paper proposes augmenting GNNs for learning to solve Integer Linear Programs (ILPs) with a Local-UID scheme based on d-hop uniqueness coloring. This ensures identifiers are unique only within each node's d-hop neighborhood. The authors introduce ColorGNN (color-conditioned embeddings) and ColorUID (feature-level variant), prove that for d-layer networks Local-UIDs match the expressive power of Global-UIDs while improving generalization, and report substantial experimental gains across ILP benchmarks.

Significance. If the claims hold, the work is significant for resolving the expressiveness-generalization tradeoff in GNN-based learning to optimize. The formal proof of equivalence specifically for d-layer networks (matching the receptive field of message passing) and the empirical robustness across benchmarks represent clear strengths that could influence GNN design for other combinatorial tasks.

minor comments (3)

[Abstract] Abstract: the term 'parsimonious Local-UID scheme' and 'd-hop uniqueness coloring' are used without a one-sentence definition, which would help readers unfamiliar with the approach.
[§5] §5 (Experiments): the main results tables should explicitly state the number of random seeds, data split protocol, and whether gains are statistically significant (e.g., via paired t-tests) to support the 'robust gains' claim.
[§3] Notation: the distinction between ColorGNN and ColorUID is clear in the method section but the precise way color information is injected into the embedding (e.g., concatenation vs. conditioning) could be stated in a single equation for quick reference.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the significance of our results on the expressiveness-generalization tradeoff, and the recommendation for minor revision. We appreciate the constructive feedback and address the points below.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper's central claim is a proof that d-layer GNNs with local d-hop uniqueness identifiers match the expressive power of global UIDs. This follows directly from the standard receptive-field property of message-passing GNNs: a d-layer network only aggregates information within each node's d-hop neighborhood, so uniqueness inside that neighborhood is sufficient to break symmetries in exactly the same way global identifiers would. The modeling of ILPs as bipartite graphs does not introduce any additional global-structure requirement beyond the receptive field. No load-bearing step reduces to a fitted parameter, a self-citation chain, or a definition that presupposes the target result. The local-UID scheme is motivated independently by the desire to avoid spurious correlations from global identifiers, and the proof is presented as a direct consequence of the d-layer architecture rather than by construction or prior author work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions about GNN limitations on ILPs and the graph modeling of ILPs; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (2)

domain assumption Anonymous GNNs have limited expressiveness for ILPs
Invoked in the abstract as the starting limitation that motivates augmentation.
domain assumption Global UIDs introduce spurious correlations harming generalization
Stated as the typical drawback of the common enhancement approach.

pith-pipeline@v0.9.0 · 5714 in / 1223 out tokens · 42770 ms · 2026-05-18T14:18:36.199634+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We prove that for d-layer networks, Local-UIDs achieve the expressive power of Global-UIDs while offering stronger generalization... d-hop unique coloring... equivalent to a (2d)-hop coloring

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 2 internal anchors

[1]

On the utilization of unique node identifiers in graph neural networks.arXiv preprint arXiv:2411.02271,

Maya Bechler-Speicher, Moshe Eliasof, Carola-Bibiane Sch¨onlieb, Ran Gilad-Bachrach, and Amir Globerson. On the utilization of unique node identifiers in graph neural networks.arXiv preprint arXiv:2411.02271,

work page arXiv
[2]

Branch and bound methods.Notes for EE364b, Stanford University, 2006:07,

Stephen Boyd and Jacob Mattingley. Branch and bound methods.Notes for EE364b, Stanford University, 2006:07,

work page 2006
[3]

When gnns meet symmetry in ilps: an orbit-based feature augmenta- tion approach.arXiv preprint arXiv:2501.14211,

Qian Chen, Lei Li, Qian Li, Jianghua Wu, Akang Wang, Ruoyu Sun, Xiaodong Luo, Tsung-Hui Chang, and Qingjiang Shi. When gnns meet symmetry in ilps: an orbit-based feature augmenta- tion approach.arXiv preprint arXiv:2501.14211,

work page arXiv
[4]

On representing linear pro- grams by graph neural networks.arXiv preprint arXiv:2209.12288,

Ziang Chen, Jialin Liu, Xinshang Wang, Jianfeng Lu, and Wotao Yin. On representing linear pro- grams by graph neural networks.arXiv preprint arXiv:2209.12288,

work page arXiv
[5]

On representing mixed-integer linear programs by graph neural networks

Ziang Chen, Jialin Liu, Xinshang Wang, and Wotao Yin. On representing mixed-integer linear programs by graph neural networks. InThe Eleventh International Conference on Learning Rep- resentations, ICLR 2023, Kigali, Rwanda, May 1-5,

work page 2023
[6]

A generalization of transformer networks to graphs, 2021

OpenReview.net, 2023a. Ziang Chen, Jialin Liu, Xinshang Wang, and Wotao Yin. On representing mixed-integer linear programs by graph neural networks. InThe Eleventh International Conference on Learning Rep- resentations, 2023b. URLhttps://openreview.net/forum?id=4gc3MGZra1d. Vijay Prakash Dwivedi and Xavier Bresson. A generalization of transformer networks...

work page arXiv 2012
[7]

arXiv preprint arXiv:2110.07875 , year=

Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, and Xavier Bresson. Graph neural networks with learnable structural and positional representations.arXiv preprint arXiv:2110.07875,

work page arXiv
[8]

Anonymous networks: ran- domization= 2-hop coloring

Yuval Emek, Christoph Pfister, Jochen Seidel, and Roger Wattenhofer. Anonymous networks: ran- domization= 2-hop coloring. InProceedings of the 2014 ACM symposium on Principles of dis- tributed computing, pp. 96–105,

work page 2014
[9]

The machine learning for combinatorial optimization competition (ml4co): Results and insights

Maxime Gasse, Simon Bowly, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Ch´etelat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M Kazachkov, et al. The machine learning for combinatorial optimization competition (ml4co): Results and insights. In NeurIPS 2021 competitions and demonstrations track, pp. 220–231. PMLR,

work page 2021
[10]

A gnn-guided predict-and-search framework for mixed-integer linear programming

Qingyu Han, Linxin Yang, Qian Chen, Xiang Zhou, Dong Zhang, Akang Wang, Ruoyu Sun, and Xi- aodong Luo. A gnn-guided predict-and-search framework for mixed-integer linear programming. arXiv preprint arXiv:2302.05636,

work page arXiv
[11]

On the stability of expressive positional encodings for graphs.arXiv preprint arXiv:2310.02579,

Yinan Huang, William Lu, Joshua Robinson, Yu Yang, Muhan Zhang, Stefanie Jegelka, and Pan Li. On the stability of expressive positional encodings for graphs.arXiv preprint arXiv:2310.02579,

work page arXiv
[12]

Learning efficient positional encodings with graph neural networks.arXiv preprint arXiv:2502.01122,

Charilaos I Kanatsoulis, Evelyn Choi, Stephanie Jegelka, Jure Leskovec, and Alejandro Ribeiro. Learning efficient positional encodings with graph neural networks.arXiv preprint arXiv:2502.01122,

work page arXiv
[13]

Semi-Supervised Classification with Graph Convolutional Networks

TN Kipf. Semi-supervised classification with graph convolutional networks.arXiv preprint arXiv:1609.02907,

work page internal anchor Pith review Pith/arXiv arXiv
[14]

Pdhg-unrolled learning-to-optimize method for large-scale linear programming

Bingheng Li, Linxin Yang, Yupeng Chen, Senmiao Wang, Haitao Mao, Qian Chen, Yao Ma, Akang Wang, Tian Ding, Jiliang Tang, and Ruoyu Sun. Pdhg-unrolled learning-to-optimize method for large-scale linear programming. InForty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27,

work page 2024
[15]

Sign and basis invariant networks for spectral graph representation learning.arXiv preprint arXiv:2202.13013,

Derek Lim, Joshua Robinson, Lingxiao Zhao, Tess Smidt, Suvrit Sra, Haggai Maron, and Stefanie Jegelka. Sign and basis invariant networks for spectral graph representation learning.arXiv preprint arXiv:2202.13013,

work page arXiv
[16]

Apollo-milp: An alternating prediction-correction neural solving framework for mixed- integer linear programming.arXiv preprint arXiv:2503.01129,

Haoyang Liu, Jie Wang, Zijie Geng, Xijun Li, Yuxuan Zong, Fangzhou Zhu, Jianye Hao, and Feng Wu. Apollo-milp: An alternating prediction-correction neural solving framework for mixed- integer linear programming.arXiv preprint arXiv:2503.01129,

work page arXiv
[17]

What graph neural networks cannot learn: Depth vs width

11 Andreas Loukas. What graph neural networks cannot learn: Depth vs width. In8th International Conference on Learning Representations, ICLR 2020,

work page 2020
[18]

Solving mixed integer programs using neural networks.arXiv preprint arXiv:2012.13349,

Vinod Nair, Sergey Bartunov, Felix Gimeno, Ingrid V on Glehn, Pawel Lichocki, Ivan Lobov, Bren- dan O’Donoghue, Nicolas Sonnerat, Christian Tjandraatmadja, Pengming Wang, et al. Solving mixed integer programs using neural networks.arXiv preprint arXiv:2012.13349,

work page arXiv 2012
[19]

How Powerful are Graph Neural Networks?

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks?arXiv preprint arXiv:1810.00826,

work page internal anchor Pith review Pith/arXiv arXiv
[20]

Graph Transformers with Laplacian-eigenvector absolute positional encodings (LapPE) were introduced by Dwivedi & Bresson (2020)

A RELATED WORKS Positional encodings for GNNsThe following PE methods arecomplementaryto our color-based UID scheme, and we therefore regard them as orthogonal components rather than head-to-head baselines. Graph Transformers with Laplacian-eigenvector absolute positional encodings (LapPE) were introduced by Dwivedi & Bresson (2020). Lim et al. (2022) des...

work page 2020

[1] [1]

On the utilization of unique node identifiers in graph neural networks.arXiv preprint arXiv:2411.02271,

Maya Bechler-Speicher, Moshe Eliasof, Carola-Bibiane Sch¨onlieb, Ran Gilad-Bachrach, and Amir Globerson. On the utilization of unique node identifiers in graph neural networks.arXiv preprint arXiv:2411.02271,

work page arXiv

[2] [2]

Branch and bound methods.Notes for EE364b, Stanford University, 2006:07,

Stephen Boyd and Jacob Mattingley. Branch and bound methods.Notes for EE364b, Stanford University, 2006:07,

work page 2006

[3] [3]

When gnns meet symmetry in ilps: an orbit-based feature augmenta- tion approach.arXiv preprint arXiv:2501.14211,

Qian Chen, Lei Li, Qian Li, Jianghua Wu, Akang Wang, Ruoyu Sun, Xiaodong Luo, Tsung-Hui Chang, and Qingjiang Shi. When gnns meet symmetry in ilps: an orbit-based feature augmenta- tion approach.arXiv preprint arXiv:2501.14211,

work page arXiv

[4] [4]

On representing linear pro- grams by graph neural networks.arXiv preprint arXiv:2209.12288,

Ziang Chen, Jialin Liu, Xinshang Wang, Jianfeng Lu, and Wotao Yin. On representing linear pro- grams by graph neural networks.arXiv preprint arXiv:2209.12288,

work page arXiv

[5] [5]

On representing mixed-integer linear programs by graph neural networks

Ziang Chen, Jialin Liu, Xinshang Wang, and Wotao Yin. On representing mixed-integer linear programs by graph neural networks. InThe Eleventh International Conference on Learning Rep- resentations, ICLR 2023, Kigali, Rwanda, May 1-5,

work page 2023

[6] [6]

A generalization of transformer networks to graphs, 2021

OpenReview.net, 2023a. Ziang Chen, Jialin Liu, Xinshang Wang, and Wotao Yin. On representing mixed-integer linear programs by graph neural networks. InThe Eleventh International Conference on Learning Rep- resentations, 2023b. URLhttps://openreview.net/forum?id=4gc3MGZra1d. Vijay Prakash Dwivedi and Xavier Bresson. A generalization of transformer networks...

work page arXiv 2012

[7] [7]

arXiv preprint arXiv:2110.07875 , year=

Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, and Xavier Bresson. Graph neural networks with learnable structural and positional representations.arXiv preprint arXiv:2110.07875,

work page arXiv

[8] [8]

Anonymous networks: ran- domization= 2-hop coloring

Yuval Emek, Christoph Pfister, Jochen Seidel, and Roger Wattenhofer. Anonymous networks: ran- domization= 2-hop coloring. InProceedings of the 2014 ACM symposium on Principles of dis- tributed computing, pp. 96–105,

work page 2014

[9] [9]

The machine learning for combinatorial optimization competition (ml4co): Results and insights

Maxime Gasse, Simon Bowly, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Ch´etelat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M Kazachkov, et al. The machine learning for combinatorial optimization competition (ml4co): Results and insights. In NeurIPS 2021 competitions and demonstrations track, pp. 220–231. PMLR,

work page 2021

[10] [10]

A gnn-guided predict-and-search framework for mixed-integer linear programming

Qingyu Han, Linxin Yang, Qian Chen, Xiang Zhou, Dong Zhang, Akang Wang, Ruoyu Sun, and Xi- aodong Luo. A gnn-guided predict-and-search framework for mixed-integer linear programming. arXiv preprint arXiv:2302.05636,

work page arXiv

[11] [11]

On the stability of expressive positional encodings for graphs.arXiv preprint arXiv:2310.02579,

Yinan Huang, William Lu, Joshua Robinson, Yu Yang, Muhan Zhang, Stefanie Jegelka, and Pan Li. On the stability of expressive positional encodings for graphs.arXiv preprint arXiv:2310.02579,

work page arXiv

[12] [12]

Learning efficient positional encodings with graph neural networks.arXiv preprint arXiv:2502.01122,

Charilaos I Kanatsoulis, Evelyn Choi, Stephanie Jegelka, Jure Leskovec, and Alejandro Ribeiro. Learning efficient positional encodings with graph neural networks.arXiv preprint arXiv:2502.01122,

work page arXiv

[13] [13]

Semi-Supervised Classification with Graph Convolutional Networks

TN Kipf. Semi-supervised classification with graph convolutional networks.arXiv preprint arXiv:1609.02907,

work page internal anchor Pith review Pith/arXiv arXiv

[14] [14]

Pdhg-unrolled learning-to-optimize method for large-scale linear programming

Bingheng Li, Linxin Yang, Yupeng Chen, Senmiao Wang, Haitao Mao, Qian Chen, Yao Ma, Akang Wang, Tian Ding, Jiliang Tang, and Ruoyu Sun. Pdhg-unrolled learning-to-optimize method for large-scale linear programming. InForty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27,

work page 2024

[15] [15]

Sign and basis invariant networks for spectral graph representation learning.arXiv preprint arXiv:2202.13013,

Derek Lim, Joshua Robinson, Lingxiao Zhao, Tess Smidt, Suvrit Sra, Haggai Maron, and Stefanie Jegelka. Sign and basis invariant networks for spectral graph representation learning.arXiv preprint arXiv:2202.13013,

work page arXiv

[16] [16]

Apollo-milp: An alternating prediction-correction neural solving framework for mixed- integer linear programming.arXiv preprint arXiv:2503.01129,

Haoyang Liu, Jie Wang, Zijie Geng, Xijun Li, Yuxuan Zong, Fangzhou Zhu, Jianye Hao, and Feng Wu. Apollo-milp: An alternating prediction-correction neural solving framework for mixed- integer linear programming.arXiv preprint arXiv:2503.01129,

work page arXiv

[17] [17]

What graph neural networks cannot learn: Depth vs width

11 Andreas Loukas. What graph neural networks cannot learn: Depth vs width. In8th International Conference on Learning Representations, ICLR 2020,

work page 2020

[18] [18]

Solving mixed integer programs using neural networks.arXiv preprint arXiv:2012.13349,

Vinod Nair, Sergey Bartunov, Felix Gimeno, Ingrid V on Glehn, Pawel Lichocki, Ivan Lobov, Bren- dan O’Donoghue, Nicolas Sonnerat, Christian Tjandraatmadja, Pengming Wang, et al. Solving mixed integer programs using neural networks.arXiv preprint arXiv:2012.13349,

work page arXiv 2012

[19] [19]

How Powerful are Graph Neural Networks?

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks?arXiv preprint arXiv:1810.00826,

work page internal anchor Pith review Pith/arXiv arXiv

[20] [20]

Graph Transformers with Laplacian-eigenvector absolute positional encodings (LapPE) were introduced by Dwivedi & Bresson (2020)

A RELATED WORKS Positional encodings for GNNsThe following PE methods arecomplementaryto our color-based UID scheme, and we therefore regard them as orthogonal components rather than head-to-head baselines. Graph Transformers with Laplacian-eigenvector absolute positional encodings (LapPE) were introduced by Dwivedi & Bresson (2020). Lim et al. (2022) des...

work page 2020