Directed Acyclic Graph Convolutional Networks
Pith reviewed 2026-05-21 23:49 UTC · model grok-4.3
The pith
The DAG Convolutional Network uses causal graph filters to respect partial ordering when learning node representations on directed acyclic graphs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By defining convolutional operations via causal graph-shift operators that admit spectral representations, the DCN learns nodal features that incorporate the partial order of a DAG, an inductive bias absent from standard GNNs. The Parallel DCN (PDCN) variant retains this bias while decoupling model complexity from graph size.
What carries the argument
Causal graph filters, constructed from a graph-shift operator adapted to the DAG partial order, that enable directional convolution in both vertex and spectral domains.
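To make the idea of a direction-respecting filter concrete, here is a minimal sketch, not the paper's construction. It assumes the adjacency matrix itself acts as the shift operator and applies a polynomial of it to a nodal signal. The 4-node DAG, the filter taps `h`, and the helper `polynomial_filter` are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical 4-node DAG, nodes already listed in a topological order.
# A[i, j] = 1 means there is an edge j -> i, so A is strictly lower triangular.
A = np.array([
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
], dtype=float)

def polynomial_filter(S, h, x):
    """Return sum_k h[k] * S^k @ x, computed by repeated shifting."""
    y = np.zeros_like(x, dtype=float)
    shifted = x.astype(float)          # S^0 x
    for tap in h:
        y += tap * shifted
        shifted = S @ shifted          # move on to the next power of S
    return y

h = [1.0, 0.5, 0.25]                         # illustrative filter taps (assumed)
impulse = np.array([0.0, 1.0, 0.0, 0.0])     # unit signal placed on node 1
y = polynomial_filter(A, h, impulse)
print(y)  # the impulse reaches node 1 and its descendant, node 3, and nothing else
```

Because the shift only moves signal along edge directions, the ancestor (node 0) and the unrelated node 2 stay at zero. That is the sense in which such a filter is causal with respect to the DAG's ordering.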
If this is right
- The architecture can be applied directly to causal-inference and scheduling problems while preserving directional constraints.
- PDCN scales to larger DAGs without a proportional rise in parameters.
- Permutation equivariance guarantees that relabeling the nodes only permutes the learned representations accordingly, without otherwise changing them.
- The spectral formulation opens the door to frequency-domain analysis of signals on acyclic graphs.
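The permutation-equivariance point in the list above can be checked numerically on a toy layer. The sketch below is a hedged illustration under assumptions: the layer is a polynomial filter followed by a ReLU, the shift operator is a random strictly lower-triangular matrix, and the names `layer`, `S`, `h`, and `P` are hypothetical rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(S, x, h):
    """One filter-plus-ReLU layer: relu(sum_k h[k] * S^k @ x)."""
    out = sum(tap * np.linalg.matrix_power(S, k) @ x for k, tap in enumerate(h))
    return np.maximum(out, 0.0)

n = 5
# Random strictly lower-triangular matrix: a weighted DAG in topological order.
S = np.tril(rng.normal(size=(n, n)), k=-1)
x = rng.normal(size=n)
h = [0.7, -0.3, 0.2]                   # illustrative filter taps (assumed)

# Relabel the nodes with a random permutation matrix P.
P = np.eye(n)[rng.permutation(n)]
y = layer(S, x, h)
y_relabelled = layer(P @ S @ P.T, P @ x, h)

# Equivariance: relabelling the input relabels the output in the same way.
equivariant = np.allclose(y_relabelled, P @ y)
print(equivariant)
```

The check relies on the identity (P S Pᵀ)ᵏ P x = P Sᵏ x and on the ReLU acting entrywise, so it says nothing about architectures that break either property.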
Where Pith is reading between the lines
- The same causal-filter idea could be tested on other ordered structures such as temporal or hierarchical graphs.
- Joint training of DCN layers with causal-discovery routines might allow simultaneous structure and representation learning.
- Stability or generalization bounds for DAG-specific tasks could be derived from the spectral properties.
Load-bearing premise
That the causal graph filters supply an inductive bias strong enough to produce clear accuracy or efficiency gains over ordinary GNNs on actual DAG datasets.
What would settle it
A controlled test on a standard DAG benchmark in which a conventional GNN without causal filters matches or exceeds the DCN accuracy would undermine the claimed advantage of the proposed filters.
Original abstract
Directed acyclic graphs (DAGs) are central to science and engineering applications including causal inference, scheduling, and neural architecture search. In this work, we introduce the DAG Convolutional Network (DCN), a novel graph neural network (GNN) architecture designed specifically for convolutional learning from signals supported on DAGs. The DCN leverages causal graph filters to learn nodal representations that account for the partial ordering inherent to DAGs, a strong inductive bias does not present in conventional GNNs. Unlike prior art in machine learning over DAGs, DCN builds on formal convolutional operations that admit spectral-domain representations. We further propose the Parallel DCN (PDCN), a model that feeds input DAG signals to a parallel bank of causal graph-shift operators and processes these DAG-aware features using a shared multilayer perceptron. This way, PDCN decouples model complexity from graph size while maintaining satisfactory predictive performance. The architectures' permutation equivariance and expressive power properties are also established. Comprehensive numerical tests across several tasks, datasets, and experimental conditions demonstrate that (P)DCN compares favorably with state-of-the-art baselines in terms of accuracy, robustness, and computational efficiency. These results position (P)DCN as a viable framework for deep learning from DAG-structured data that is designed from first (graph) signal processing principles.
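The PDCN description in the abstract (a parallel bank of shift operators feeding a shared multilayer perceptron) can be sketched as follows. This is only an illustration under assumptions: the bank is taken to be the first few powers of a random strictly lower-triangular matrix, which is a stand-in and not necessarily the paper's causal graph-shift operators, and `pdcn_forward`, `make_bank`, `W1`, and `W2` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

def pdcn_forward(shifts, x, W1, W2):
    """Parallel bank of shifts followed by one MLP shared across all nodes."""
    # Column k holds the signal as seen through the k-th shift operator.
    Z = np.stack([S @ x for S in shifts], axis=1)   # shape (n, K)
    hidden = np.maximum(Z @ W1, 0.0)                # same weights for every node
    return hidden @ W2                              # shape (n, out_dim)

def make_bank(n, K):
    """Hypothetical bank: powers 0..K-1 of a random strictly lower-triangular matrix."""
    A = np.tril(rng.normal(size=(n, n)), k=-1)
    return [np.linalg.matrix_power(A, k) for k in range(K)]

K, hidden_dim, out_dim = 3, 8, 1
W1 = rng.normal(size=(K, hidden_dim))
W2 = rng.normal(size=(hidden_dim, out_dim))

# The same weights run unchanged on DAGs of different sizes.
for n in (6, 60):
    x = rng.normal(size=n)
    y = pdcn_forward(make_bank(n, K), x, W1, W2)
    print(n, y.shape, W1.size + W2.size)
```

The weight shapes depend only on the bank size `K` and the layer widths, never on the number of nodes. This is one plausible reading of "decouples model complexity from graph size".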
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the DAG Convolutional Network (DCN) and Parallel DCN (PDCN) for learning nodal representations from signals on directed acyclic graphs. It constructs causal graph filters based on graph signal processing that respect the partial order of DAGs via the graph shift operator, establishes permutation equivariance and expressive power, and reports that the models compare favorably to state-of-the-art baselines in accuracy, robustness, and efficiency across multiple tasks and datasets.
Significance. If the central claims hold, the work supplies a principled GSP-derived inductive bias for partial orders that standard GNNs lack, which could benefit causal inference, scheduling, and neural architecture search. Credit is due for the formal convolutional operations with spectral representations, the proofs of equivariance and expressivity, and the PDCN design that decouples complexity from graph size while retaining performance.
major comments (2)
- [Section 5] Section 5 (Numerical Experiments): the abstract states that comprehensive tests demonstrate favorable comparison, yet the reported results lack error bars, statistical significance tests, and ablations isolating the causal filter component from other architectural choices. This is load-bearing for the claim that the partial-order bias translates into measurable gains.
- [Section 3.2] Section 3.2 (Causal Graph Filters): the spectral representation of the causal filters is presented as respecting acyclicity by construction, but the derivation does not explicitly verify that the filter coefficients remain valid under arbitrary topological orderings of the same DAG; a counter-example or invariance proof would be required.
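The ordering question raised in the second major comment can at least be probed numerically on a small case. The sketch below is a sanity check under assumptions, not a proof and not the paper's filters: it uses a plain polynomial of the adjacency matrix, a hypothetical diamond-shaped DAG, and made-up signal values and taps.

```python
import numpy as np

# Hypothetical diamond DAG: a -> b, a -> c, b -> d, c -> d.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
signal = {"a": 1.0, "b": -2.0, "c": 0.5, "d": 3.0}
h = [1.0, 0.5, 0.25]                          # illustrative filter taps (assumed)

def filter_in_order(order):
    """Build A for the given node ordering and return the output per node name."""
    idx = {v: i for i, v in enumerate(order)}
    n = len(order)
    A = np.zeros((n, n))
    for src, dst in edges:
        A[idx[dst], idx[src]] = 1.0           # edge src -> dst
    assert np.allclose(A, np.tril(A, k=-1))   # strictly lower triangular
    x = np.array([signal[v] for v in order])
    y = sum(t * np.linalg.matrix_power(A, k) @ x for k, t in enumerate(h))
    return {v: float(y[idx[v]]) for v in order}

# Two different topological sorts of the same DAG.
out1 = filter_in_order(["a", "b", "c", "d"])
out2 = filter_in_order(["a", "c", "b", "d"])
same = all(np.isclose(out1[v], out2[v]) for v in out1)
print(same)
```

Agreement on one toy graph does not replace the invariance argument the referee asks for. It only shows what such an argument would need to guarantee in general.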
minor comments (2)
- [Abstract] Abstract: the clause 'a strong inductive bias does not present in conventional GNNs' contains a grammatical error and should read 'a strong inductive bias that is not present in conventional GNNs'.
- [Throughout] Notation: the graph-shift operator is introduced with multiple symbols across sections; a single consistent symbol and a forward reference to its definition would improve readability.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and describe the revisions we will make to strengthen the empirical and theoretical sections.
- Referee: [Section 5] Section 5 (Numerical Experiments): the abstract states that comprehensive tests demonstrate favorable comparison, yet the reported results lack error bars, statistical significance tests, and ablations isolating the causal filter component from other architectural choices. This is load-bearing for the claim that the partial-order bias translates into measurable gains.
Authors: We agree that the current experimental section would benefit from additional statistical rigor. In the revised manuscript we will rerun all experiments over at least five independent random seeds, report mean performance together with standard-deviation error bars, and include paired t-tests (or Wilcoxon signed-rank tests where appropriate) to establish statistical significance against the strongest baselines. We will also add a dedicated ablation subsection that replaces the causal graph-shift operators with ordinary (non-causal) polynomial filters while keeping all other architectural choices fixed; the resulting performance drop will quantify the contribution of the partial-order inductive bias. These changes will appear in an expanded Section 5 and the associated appendix.
revision: yes
- Referee: [Section 3.2] Section 3.2 (Causal Graph Filters): the spectral representation of the causal filters is presented as respecting acyclicity by construction, but the derivation does not explicitly verify that the filter coefficients remain valid under arbitrary topological orderings of the same DAG; a counter-example or invariance proof would be required.
Authors: The causal filters are constructed from the adjacency matrix of the DAG, which is nilpotent and becomes strictly triangular under any valid topological ordering. Because the filter is ultimately applied in the vertex domain, its action on a signal is independent of the particular ordering chosen to triangularize the matrix. Nevertheless, we acknowledge that an explicit invariance argument is missing. In the revision we will insert a short lemma (with proof) in Section 3.2 showing that the output of any polynomial causal filter is identical for all topological sorts of the same DAG; the proof relies on the fact that different orderings correspond to permutation-similar matrices whose nilpotency index and spectrum remain unchanged. A brief counter-example illustrating what would break if the filter were not causal will also be added for clarity.
revision: yes
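The statistical protocol promised in the first response (several seeds, mean with standard-deviation error bars, a paired test against a baseline) might look roughly like the sketch below. The numbers are made-up placeholders, not results from the paper, and the helper names are hypothetical. Only the Python standard library is used, so the sketch stops at the t statistic. A p-value or a Wilcoxon signed-rank test could be obtained from `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon`.

```python
import math
import statistics

# Made-up placeholder errors for five seeds; NOT results from the paper.
errors_model = [0.21, 0.19, 0.22, 0.20, 0.18]
errors_baseline = [0.25, 0.22, 0.27, 0.23, 0.24]

def mean_std(values):
    """Mean and sample standard deviation, as used for error bars."""
    return statistics.mean(values), statistics.stdev(values)

def paired_t_statistic(a, b):
    """t statistic of the per-seed differences a[i] - b[i] (df = len(a) - 1)."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

m_mean, m_std = mean_std(errors_model)
b_mean, b_std = mean_std(errors_baseline)
t_stat = paired_t_statistic(errors_model, errors_baseline)
print(f"model    {m_mean:.3f} +/- {m_std:.3f}")
print(f"baseline {b_mean:.3f} +/- {b_std:.3f}")
print(f"paired t = {t_stat:.2f} with {len(errors_model) - 1} degrees of freedom")
```

Pairing by seed matters here: both models must be evaluated on the same splits and initializations for the per-seed differences to be meaningful.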
Circularity Check
No significant circularity identified
The paper constructs DCN and PDCN directly from graph signal processing principles by defining causal graph filters via the DAG shift operator that respects partial ordering. Permutation equivariance and expressive power follow as standard consequences of the convolutional construction. No step reduces a claimed prediction or first-principles result to a fitted parameter or self-citation by construction; the framework remains self-contained with independent content relative to external GSP benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The nodes of a DAG admit a well-defined partial order that causal graph-shift operators can exploit.
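The partial order behind this assumption can be made explicit by computing reachability on the DAG. The sketch below is an illustration on a hypothetical 4-node graph; the function `reachability` and the edge list are assumptions made for the example, not objects defined in the paper.

```python
import numpy as np

# Hypothetical DAG with edges 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; A[i, j] = 1 for j -> i.
A = np.zeros((4, 4), dtype=int)
for src, dst in [(0, 1), (0, 2), (1, 3), (2, 3)]:
    A[dst, src] = 1

def reachability(A):
    """R[i, j] is True when j == i or there is a directed path j -> ... -> i."""
    n = A.shape[0]
    R = np.eye(n, dtype=bool)
    power = np.eye(n, dtype=int)
    for _ in range(n - 1):             # paths in a DAG have at most n - 1 edges
        power = power @ A
        R |= power > 0
    return R

R = reachability(A)
Ri = R.astype(int)
# Partial-order axioms for "j precedes or equals i".
reflexive = bool(np.all(np.diag(R)))
antisymmetric = not np.any(R & R.T & ~np.eye(4, dtype=bool))
transitive = bool(np.all(((Ri @ Ri) > 0) <= R))
# Nodes 1 and 2 are not related either way, so the order is partial, not total.
incomparable = (not R[1, 2]) and (not R[2, 1])
print(reflexive, antisymmetric, transitive, incomparable)
```

Antisymmetry is exactly where acyclicity enters: a directed cycle would make two distinct nodes reachable from each other and the relation would stop being an order.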