Flock: A Knowledge Graph Foundation Model via Learning on Random Walks

arxiv: 2510.01510 · v3 · submitted 2025-10-01 · 💻 cs.LG

Flock: A Knowledge Graph Foundation Model via Learning on Random Walks

Jinwoo Kim , Xingyue Huang , Krzysztof Olejniczak , Kyungbin Min , Michael Bronstein , Seunghoon Hong , \.Ismail \.Ilkan Ceylan This is my paper

Pith reviewed 2026-05-18 10:11 UTC · model grok-4.3

classification 💻 cs.LG

keywords knowledge graphfoundation modelrandom walklink predictionequivariancezero-shot learninguniversal approximatorPetals dataset

0 comments p. Extension

The pith

Flock learns on random walks to enforce probabilistic node-relation equivariance and serve as a universal approximator for isomorphism-invariant link functions on knowledge graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that deterministic equivariance restricts knowledge graph foundation models from telling apart relations that share the same structure but carry different meanings. By shifting to probabilistic node-relation equivariance, a model can keep distributional invariance while using structured randomness at inference to break unwanted symmetries. Flock puts this into practice by repeatedly sampling random walks, turning the walks into sequences, embedding the sequences with a sequence model, and then aggregating node and relation features through learned pooling. This construction makes the model a universal approximator for link-level functions that remain unchanged under graph isomorphism. If the approach holds, models could transfer structural knowledge to entirely new knowledge graphs containing unseen entities and relations.

Core claim

Flock is a knowledge graph foundation model that iteratively samples random walks, encodes them into sequences, embeds the sequences with a sequence model, and aggregates node and relation representations through learned pooling. It respects probabilistic node-relation equivariance and is a universal approximator for isomorphism-invariant link-level functions over KGs. This enables perfect performance on the Petals diagnostic dataset where prior models fail and state-of-the-art results on entity and relation prediction across 54 KGs from diverse domains.

What carries the argument

Probabilistic node-relation equivariance realized by sampling random walks, encoding them as sequences, embedding with a sequence model, and applying learned pooling to produce node and relation representations.

If this is right

Flock distinguishes relations that are structurally similar but semantically distinct, unlike models limited by deterministic equivariance.
The model generalizes structural properties to novel knowledge graphs containing unseen entities and relations.
Flock achieves perfect accuracy on the Petals diagnostic dataset.
The approach reaches state-of-the-art results on both entity and relation prediction tasks across 54 knowledge graphs from diverse domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The random-walk sampling strategy could scale equivariant modeling to larger graphs where full traversal is impractical.
Replacing the sequence model with more recent architectures might further improve capture of long-range structural patterns.
The same probabilistic symmetry-breaking idea could be tested on other structured prediction tasks that require both invariance and fine distinction.

Load-bearing premise

That sampling random walks and applying learned pooling on sequence embeddings can implement probabilistic node-relation equivariance in a way that distinguishes semantically distinct relations while preserving generalization to novel KGs.

What would settle it

A run in which Flock fails to reach perfect accuracy on the Petals dataset or does not outperform prior models on entity and relation prediction across the 54 KGs would indicate that the probabilistic equivariance mechanism does not deliver the claimed distinction or generalization.

read the original abstract

We study the problem of zero-shot link prediction on knowledge graphs (KGs), which requires models to generalize to novel entities and novel relations. Knowledge graph foundation models (KGFMs) address this task by enforcing equivariance over both nodes and relations, which enables them to learn structural properties of nodes and relations that transfer to novel KGs with similar structure. However, the conventional notion of deterministic equivariance inherently limits the expressive power of KGFMs, as it prevents them from distinguishing relations that are structurally similar but semantically distinct. To overcome this limitation, we propose to leverage probabilistic node-relation equivariance, which preserves equivariance in distribution while using structured randomness to break symmetries at inference time. Building on this principle, we present Flock, a KGFM that iteratively samples random walks, encodes them into sequences, embeds them with a sequence model, and aggregates node and relation representations through learned pooling. Flock respects probabilistic node-relation equivariance and, crucially, is a universal approximator for isomorphism-invariant link-level functions over KGs. Empirically, Flock perfectly solves our new diagnostic dataset Petals on which current KGFMs fail, and achieves state-of-the-art performance on entity and relation prediction tasks across 54 KGs from diverse domains. Code is available at https://github.com/jw9730/flock.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Flock, a knowledge graph foundation model for zero-shot link prediction that iteratively samples random walks on KGs, encodes them as sequences, embeds the sequences with a sequence model, and aggregates node/relation representations via learned pooling. It claims that this architecture respects probabilistic node-relation equivariance (distributional invariance under node/relation isomorphisms with structured randomness to break symmetries) and is a universal approximator for isomorphism-invariant link-level functions. Empirically, Flock is reported to perfectly solve the new Petals diagnostic dataset (where prior KGFMs fail) and to achieve state-of-the-art results on entity and relation prediction across 54 KGs from diverse domains, with code released.

Significance. If the central claims on probabilistic equivariance and universal approximation hold, the work would meaningfully advance KGFMs by relaxing the expressivity limits of deterministic equivariance while retaining transfer to novel KGs. The introduction of structured randomness to distinguish structurally similar but semantically distinct relations is a conceptually clean idea. Credit is due for the new Petals diagnostic set, the broad empirical evaluation across 54 KGs, and the public code release, all of which facilitate reproducibility and further testing.

major comments (2)

[§3.2] §3.2 (Architecture and probabilistic equivariance): The argument that learned pooling over sequence embeddings from random walks realizes probabilistic node-relation equivariance is not supported by an explicit architectural constraint, symmetry-enforcing loss term, or formal invariance proof. A generic attention or MLP pooling function can in principle produce aggregations whose output distribution is not invariant under node/relation isomorphisms, undermining the central claim that the model respects the required distributional property.
[§3.4, Theorem 1] §3.4, Theorem 1 (Universal approximation): The proof that Flock is a universal approximator for isomorphism-invariant link-level functions relies on the expressivity of random-walk sequences plus the sequence model and pooling, but does not address how finite sampling and the specific learned pooling guarantee approximation of all such functions (including those that must distinguish semantically distinct relations). This is load-bearing for the theoretical contribution and requires a more detailed construction or reduction.

minor comments (1)

[Table 1, Figure 3] Table 1 and Figure 3: axis labels and legend entries for the Petals results could be expanded to explicitly note the probabilistic vs. deterministic equivariance distinction being tested.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. The two major comments identify areas where the current presentation of probabilistic equivariance and the universal-approximation argument would benefit from greater formality. We address each point below and will incorporate the requested clarifications and proof details in the revised manuscript.

read point-by-point responses

Referee: [§3.2] §3.2 (Architecture and probabilistic equivariance): The argument that learned pooling over sequence embeddings from random walks realizes probabilistic node-relation equivariance is not supported by an explicit architectural constraint, symmetry-enforcing loss term, or formal invariance proof. A generic attention or MLP pooling function can in principle produce aggregations whose output distribution is not invariant under node/relation isomorphisms, undermining the central claim that the model respects the required distributional property.

Authors: We agree that the manuscript currently motivates probabilistic equivariance primarily through the architectural choices (structured random-walk sampling followed by sequence encoding and learned pooling) without a self-contained invariance argument. While the sampling procedure is designed to produce an output distribution that is invariant under node/relation isomorphisms, we acknowledge that a generic pooling function could in principle violate this property. In the revision we will add a concise formal argument in §3.2 showing that the combination of the random-walk measure and the permutation-equivariant sequence model yields a distribution over embeddings that is invariant under the relevant isomorphisms, with the learned pooling acting only on already-invariant statistics. revision: yes
Referee: [§3.4, Theorem 1] §3.4, Theorem 1 (Universal approximation): The proof that Flock is a universal approximator for isomorphism-invariant link-level functions relies on the expressivity of random-walk sequences plus the sequence model and pooling, but does not address how finite sampling and the specific learned pooling guarantee approximation of all such functions (including those that must distinguish semantically distinct relations). This is load-bearing for the theoretical contribution and requires a more detailed construction or reduction.

Authors: We accept that the existing sketch of Theorem 1 leaves open the question of finite sampling and the precise role of the learned pooling operator. The high-level argument relies on the fact that sufficiently long random walks can encode all isomorphism-invariant information and that the sequence model is universal for sequence functions, yet a rigorous reduction showing that finite samples plus a fixed pooling architecture can still approximate any continuous isomorphism-invariant link function is missing. In the revised version we will expand the proof to include an explicit construction: first, a finite-sample concentration argument that controls the deviation from the infinite-walk limit, and second, a reduction showing that the learned pooling can realize any continuous symmetric function on the multiset of walk embeddings, thereby recovering the required distinctions among semantically distinct but structurally similar relations. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper's claims of respecting probabilistic node-relation equivariance and being a universal approximator follow directly from the described architecture of iterative random walk sampling, sequence encoding, embedding via a sequence model, and learned pooling. These steps are composed from standard, independently verifiable components without any reduction of the central results to fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations that substitute for external justification. The probabilistic aspect arises from the structured randomness in walks, and the approximation property is positioned as a consequence of the expressiveness of the sequence model plus pooling, both of which remain falsifiable via the reported empirical results on Petals and the 54 KGs. No steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the new notion of probabilistic node-relation equivariance and the assumption that random-walk sampling plus sequence embedding plus learned pooling realizes it while remaining a universal approximator.

axioms (1)

domain assumption Probabilistic node-relation equivariance preserves equivariance in distribution while allowing symmetry breaking via structured randomness
Invoked to overcome the expressivity limit of deterministic equivariance.

pith-pipeline@v0.9.0 · 5799 in / 1130 out tokens · 43579 ms · 2026-05-18T10:11:50.681848+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Proposition 4.2. Suppose that the walk sampling protocol η is invariant in probability and both the recording protocol w and the consensus protocol c are invariant. Then ... FLOCK model is invariant in probability.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.