RankGraph-2: Lifecycle Co-Design for Billion-Node Graph Learning in Recommendation

Haomin Yu; Hong Li; Hong Yan; Junjie Yang; Ke Pan; Li Yu; Mahesh Srinivasan; Nipun Mathur; Renzhi Wu; Sri Reddy

arxiv: 2606.18379 · v2 · pith:SS57AOFGnew · submitted 2026-06-16 · 💻 cs.IR · cs.AI

Renzhi Wu, Zikun Cui, Junjie Yang, Tai Guo, Hong Li, Xian Chen, Li Yu, Ke Pan, Sri Reddy, Mahesh Srinivasan, Nipun Mathur, Haomin Yu, Hong Yan

Pith reviewed 2026-06-30 10:40 UTC · model grok-4.3

classification 💻 cs.IR cs.AI

keywords: graph-based retrieval, recommendation systems, billion-node graphs, lifecycle co-design, similarity retrieval, cluster index, personalized PageRank, subsampling

The pith

Co-design of graph construction, learning, and serving enables high-recall retrieval on billion-node recommendation graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that jointly designing graph construction, representation learning, and real-time serving solves the coupled problems of billion-node graph retrieval for recommendations, where each stage shapes the others. Serving needs a co-learned cluster index to replace online KNN, so training must incorporate index optimization and construction must produce self-contained pre-computed data. Construction therefore applies subsampling with popularity bias correction to shrink edges from hundreds of trillions to hundreds of billions and uses personalized PageRank to pre-compute multi-hop neighborhoods, supporting hour-level refreshes. This yields an 83 percent reduction in serving cost plus 3.8 times higher recall than a GAT plus Deep Graph Infomax baseline on bipartite graphs.
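The pre-computation of multi-hop neighborhoods with personalized PageRank can be illustrated with a minimal sketch. The abstract does not give the paper's algorithm or parameters, so the power-iteration form, the restart probability `alpha`, the adjacency-dict format, and the names `personalized_pagerank`, `top_k_neighborhood`, and `TOY_GRAPH` are all assumptions chosen for illustration on a toy bipartite graph.

```python
from collections import defaultdict


def personalized_pagerank(adj, source, alpha=0.15, iters=50):
    """Approximate personalized PageRank scores from `source` by power iteration.

    adj: dict mapping node -> list of neighbor nodes (hypothetical toy format).
    alpha: restart probability (an assumed value, not taken from the paper).
    """
    scores = {source: 1.0}
    for _ in range(iters):
        nxt = defaultdict(float)
        nxt[source] += alpha  # restart mass always returns to the source
        for node, mass in scores.items():
            nbrs = adj.get(node, [])
            if not nbrs:
                # Dangling node: send its walk mass back to the source.
                nxt[source] += (1 - alpha) * mass
                continue
            share = (1 - alpha) * mass / len(nbrs)
            for nb in nbrs:
                nxt[nb] += share
        scores = dict(nxt)
    return scores


def top_k_neighborhood(adj, source, k, alpha=0.15, iters=50):
    """Return the k highest-scoring nodes other than `source`."""
    scores = personalized_pagerank(adj, source, alpha, iters)
    ranked = sorted((n for n in scores if n != source),
                    key=lambda n: scores[n], reverse=True)
    return ranked[:k]


# Toy user-item bipartite graph (hypothetical data).
TOY_GRAPH = {
    "u1": ["i1", "i2"], "u2": ["i1", "i2"], "u3": ["i2", "i3"],
    "i1": ["u1", "u2"], "i2": ["u1", "u2", "u3"], "i3": ["u3"],
}

if __name__ == "__main__":
    print(top_k_neighborhood(TOY_GRAPH, "u1", k=3))
```

Because the scores depend only on the graph and the source node, a table of top-k neighbors per node can be computed offline and shipped with the training data, which is the property the paragraph above relies on.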

Core claim

RankGraph-2 shows that the requirements of each lifecycle stage determine the others: serving requires a pre-learned cluster index, training must therefore co-optimize that index, and construction must therefore output self-contained data that supports offline neighborhood computation via subsampling and personalized PageRank, allowing a simple architecture to support U2U2I and U2I2I similarity retrieval at scale.

What carries the argument

The lifecycle co-design that propagates serving constraints backward, implemented as subsampled neighborhoods with popularity bias correction, personalized PageRank pre-computation, and residual-quantization cluster index co-training.
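To make the residual-quantization cluster index concrete, here is a minimal sketch of greedy residual quantization with fixed codebooks. It is not the paper's method: RankGraph-2 is described as co-learning the index during training, whereas the codebooks, the two-level depth, and the names `rq_encode` and `rq_decode` below are hypothetical and hand-picked. The point is only to show how an embedding maps to a short code whose first entry can act as a coarse cluster id.

```python
def nearest(codebook, vec):
    """Index of the codeword closest to vec in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rq_encode(vec, codebooks):
    """Greedily quantize vec level by level; return (codes, final residual)."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen codeword; the next level quantizes what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, residual


def rq_decode(codes, codebooks):
    """Reconstruct an approximation by summing the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical hand-picked codebooks: a coarse level and a finer level.
CODEBOOKS = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
    [[0.25, 0.25], [0.25, -0.25], [-0.25, 0.25], [-0.25, -0.25]],
]

if __name__ == "__main__":
    codes, residual = rq_encode([0.9, 0.3], CODEBOOKS)
    print(codes, rq_decode(codes, CODEBOOKS), residual)
```

At serving time, a lookup keyed by the code prefix can replace a nearest-neighbor search over all embeddings, which is the kind of saving the co-learned index is reported to provide.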

If this is right

Subsampling with popularity bias correction reduces hundreds of trillions of edges to hundreds of billions while supporting the retrieval task.
Co-learning the residual-quantization cluster index reduces serving computational cost by 83 percent.
The resulting system achieves 3.8 times higher recall than a GAT plus Deep Graph Infomax model on bipartite graphs.
The system achieves 2.1 times higher recall than PyTorch-BigGraph on item retrieval.
Production deployment yields gains of up to 0.96 percent in CTR and 2.75 percent in CVR and has powered 20 or more retrieval launches.
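The first consequence above, shrinking the edge set while correcting for popularity, can be sketched as follows. The abstract does not specify the sampling rule, so this example swaps in one common form of popularity-aware subsampling: each user-item edge is kept with probability min(1, (threshold / item_degree) ** power). The rule itself, the `threshold` and `power` parameters, and the function names are assumptions, not the paper's procedure.

```python
import random
from collections import Counter


def keep_probability(item_degree, threshold, power=0.5):
    """Keep probability for an edge; shrinks as the item gets more popular."""
    return min(1.0, (threshold / item_degree) ** power)


def subsample_edges(edges, threshold, power=0.5, seed=0):
    """Drop edges at random, thinning edges to popular items the most.

    edges: list of (user, item) pairs (hypothetical toy format).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    degree = Counter(item for _, item in edges)
    kept = []
    for user, item in edges:
        if rng.random() < keep_probability(degree[item], threshold, power):
            kept.append((user, item))
    return kept


# Hypothetical data: one very popular item and one tail item.
EDGES = [(f"u{n}", "popular") for n in range(1000)]
EDGES += [(f"u{n}", "tail") for n in range(3)]

if __name__ == "__main__":
    kept = subsample_edges(EDGES, threshold=10)
    print(Counter(item for _, item in kept))
```

Under this assumed rule, items at or below the threshold keep all of their edges, so the reduction comes almost entirely from the head of the popularity distribution.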

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The tolerance for pre-computed static neighborhoods may extend the same co-design pattern to other large-scale embedding retrieval problems outside recommendation graphs.
The popularity bias correction step could require re-tuning when applied to graphs whose degree distributions differ markedly from the recommendation setting.
Hour-level refresh capability opens the possibility of testing the construction pipeline on catalogs with faster item turnover than typical recommendation surfaces.

Load-bearing premise

Similarity-based retrieval tolerates static pre-computed neighborhoods from subsampled graphs rather than requiring dynamic online graph infrastructure, and the subsampling preserves the information needed for the downstream task.

What would settle it

A side-by-side recall measurement on the same billion-node bipartite graph when retrieval uses the paper's pre-computed neighborhoods versus full online KNN or graph queries.
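The comparison described above can be sketched on toy data: retrieve with a cluster-restricted search, retrieve with exact KNN over all items, and report the overlap as recall. Everything here is an assumption for illustration, including the inner-product scoring, the hand-assigned clusters and centroids, and the names `exact_top_k`, `cluster_top_k`, and `recall_at_k`; it does not reproduce the paper's evaluation protocol.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def exact_top_k(query, items, k):
    """Brute-force KNN: rank every item by inner product with the query."""
    return sorted(items, key=lambda i: dot(query, items[i]), reverse=True)[:k]


def cluster_top_k(query, items, cluster_of, centroids, k, n_probe=1):
    """Search only the items in the n_probe clusters nearest to the query."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: dot(query, centroids[c]), reverse=True)
    probed = set(ranked[:n_probe])
    candidates = {i: v for i, v in items.items() if cluster_of[i] in probed}
    return exact_top_k(query, candidates, k)


def recall_at_k(retrieved, reference):
    """Fraction of the reference set that the retrieved set recovers."""
    return len(set(retrieved) & set(reference)) / len(reference)


# Hypothetical toy embeddings and cluster assignments.
ITEMS = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.6, 0.6],
         "d": [0.0, 1.0], "e": [0.1, 0.9]}
CLUSTER_OF = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
CENTROIDS = [[0.95, 0.05], [0.23, 0.83]]
QUERY = [1.0, 0.2]

if __name__ == "__main__":
    reference = exact_top_k(QUERY, ITEMS, k=3)
    for n_probe in (1, 2):
        got = cluster_top_k(QUERY, ITEMS, CLUSTER_OF, CENTROIDS, 3, n_probe)
        print(n_probe, recall_at_k(got, reference))
```

Probing more clusters trades serving cost for recall, so a full test of the premise would report recall alongside the number of candidates scored per query.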

Original abstract

Graph-based retrieval at billion-node scale requires jointly solving three tightly coupled problems -- graph construction, representation learning, and real-time serving -- yet existing work addresses each in isolation. We present RankGraph-2, a framework deployed at Meta that co-designs all three lifecycle stages for similarity-based retrieval (U2U2I and U2I2I), where each stage's requirements shape the others. Serving requires a co-learned cluster index to avoid expensive online KNN -- this pushes index co-training into the training objective. Training benefits from the observation that similarity-based retrieval tolerates pre-computed neighborhoods, eliminating online graph infrastructure -- this requires construction to produce self-contained data. Construction must also support hour-level refresh for item coverage. Acting on these cascading requirements, RankGraph-2 reduces hundreds of trillions of edges to hundreds of billions via subsampling with popularity bias correction, pre-computes multi-hop neighborhoods via personalized PageRank, and co-learns a residual-quantization cluster index that reduces serving computational cost by 83%. This lifecycle co-design enables a simple architecture to achieve 3.8x higher recall than a GAT + Deep Graph Infomax model on a bipartite graph and 2.1x higher than PyTorch-BigGraph on item retrieval. RankGraph-2 delivers up to +0.96% CTR and +2.75% CVR, and has powered 20+ retrieval launches across major surfaces.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RankGraph-2 is a deployed Meta system that co-designs graph construction, training, and serving for billion-node recommendation retrieval and reports concrete recall and CTR lifts from that joint approach.

The main point is that this paper describes a production system at Meta that treats graph construction, representation learning, and serving as one loop rather than separate problems. They pre-compute multi-hop neighborhoods with personalized PageRank, cut the edge count from hundreds of trillions to hundreds of billions by subsampling with popularity bias correction, and co-train a residual-quantization index so that serving stays cheap without online KNN. The result is a simpler model that reports 3.8x higher recall than GAT+DGI on a bipartite graph and 2.1x higher than PyTorch-BigGraph on item retrieval, plus gains of up to 0.96% CTR and 2.75% CVR, with 20+ retrieval launches powered by the system.

What the work does well is show how serving constraints actually change the earlier stages: the need for a static index forces the construction and training choices. The numbers come from real surfaces rather than toy graphs, which gives the claims some weight.

The soft spots are mostly about missing detail. The abstract gives the high-level story but not the exact training objective, the ablation numbers that isolate each change, or the full set of baselines. It is not obvious how much the popularity-bias correction preserves the signal needed downstream, and the claim that similarity retrieval tolerates pre-computed neighborhoods would be stronger with more discussion of its limits. These are normal gaps for an industry paper rather than fatal ones.

This is for people who build or study large-scale graph retrieval in recommendations. It is grounded enough in deployment results to deserve a serious referee, even if the methods section will need expansion.

Referee Report

0 major / 2 minor

Summary. The manuscript presents RankGraph-2, a deployed framework at Meta for co-designing graph construction, representation learning, and serving in billion-node similarity-based retrieval (U2U2I and U2I2I) for recommendations. Serving constraints drive co-training of a residual-quantization cluster index; this in turn requires construction to produce static, self-contained multi-hop neighborhoods via personalized PageRank after aggressive subsampling with popularity-bias correction. The resulting system is reported to deliver 3.8x recall over GAT+Deep Graph Infomax on bipartite graphs and 2.1x over PyTorch-BigGraph on item retrieval, together with up to +0.96% CTR and +2.75% CVR lifts, and has supported more than 20 production launches.

Significance. If the empirical claims are substantiated by the full experimental sections, the work is significant for large-scale industrial graph retrieval. It supplies concrete evidence that joint optimization across the lifecycle can eliminate online graph infrastructure while preserving retrieval quality, and the production deployment record constitutes a strong real-world validation of the co-design approach.

minor comments (2)

[Abstract] The retrieval tasks U2U2I and U2I2I are referenced without a one-sentence definition; adding this would aid readers outside the immediate sub-area.
[Abstract] The 83% serving-cost reduction is stated as a direct outcome of the co-learned index; a brief parenthetical on the metric (latency, FLOPs, or memory) would improve precision.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of RankGraph-2 and the recommendation for minor revision. The provided summary accurately reflects the paper's focus on lifecycle co-design for billion-node graph retrieval and the reported production outcomes. No major comments appear in the report.

Circularity Check

0 steps flagged

No significant circularity detected

The paper presents an empirical systems framework whose central claims are performance gains measured on deployed retrieval tasks. No equations, fitted parameters, or uniqueness theorems are supplied in the abstract or described text that reduce a reported outcome to an input by construction. The co-design argument is framed as a set of engineering constraints (serving cost, refresh latency, neighborhood pre-computation) that motivate implementation choices, with results presented as measured consequences rather than algebraic identities. Self-citations, if present, are not invoked as load-bearing uniqueness results. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated beyond high-level design choices.

pith-pipeline@v0.9.1-grok · 5831 in / 1073 out tokens · 33089 ms · 2026-06-30T10:40:41.137182+00:00 · methodology

