
arxiv: 2606.18379 · v2 · pith:SS57AOFGnew · submitted 2026-06-16 · 💻 cs.IR · cs.AI

RankGraph-2: Lifecycle Co-Design for Billion-Node Graph Learning in Recommendation

Pith reviewed 2026-06-30 10:40 UTC · model grok-4.3

classification 💻 cs.IR cs.AI
keywords graph-based retrieval · recommendation systems · billion-node graphs · lifecycle co-design · similarity retrieval · cluster index · personalized PageRank · subsampling

The pith

Co-design of graph construction, learning, and serving enables high-recall retrieval on billion-node recommendation graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that jointly designing graph construction, representation learning, and real-time serving solves the coupled problems of billion-node graph retrieval for recommendations, where each stage shapes the others. Serving needs a co-learned cluster index to replace online KNN, so training must incorporate index optimization and construction must produce self-contained pre-computed data. Construction therefore applies subsampling with popularity bias correction to shrink edges from hundreds of trillions to hundreds of billions and uses personalized PageRank to pre-compute multi-hop neighborhoods, supporting hour-level refreshes. This yields an 83 percent reduction in serving cost plus 3.8 times higher recall than a GAT plus Deep Graph Infomax baseline on bipartite graphs.
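As an illustration of the neighborhood pre-computation step, the following is a minimal sketch of personalized PageRank by power iteration, followed by a top-k cut to form a static neighborhood. The function names, the restart probability `alpha=0.15`, the iteration count, and the toy bipartite graph are all assumptions made for this example; the summary above does not state how the paper computes or truncates its PageRank scores.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    """Power iteration for personalized PageRank from one source node.

    adj maps each node to a list of its out-neighbors.
    alpha is the restart probability (an assumed value, not from the paper).
    """
    nodes = set(adj)
    for nbrs in adj.values():
        nodes.update(nbrs)
    rank = {n: 0.0 for n in nodes}
    rank[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        nxt[source] += alpha  # restart mass always returns to the source
        for u, mass in rank.items():
            out = adj.get(u, [])
            if not out:
                # Dangling node: send its walk mass back to the source.
                nxt[source] += (1.0 - alpha) * mass
                continue
            share = (1.0 - alpha) * mass / len(out)
            for v in out:
                nxt[v] += share
        rank = nxt
    return rank


def top_k_neighbors(adj, source, k, alpha=0.15, iters=100):
    """Return the k highest-scoring nodes other than the source."""
    scores = personalized_pagerank(adj, source, alpha=alpha, iters=iters)
    ranked = sorted((n for n in scores if n != source),
                    key=lambda n: (-scores[n], n))
    return ranked[:k]


# Hypothetical user-item graph, stored as undirected adjacency lists.
toy_graph = {
    "u1": ["i1", "i2"],
    "u2": ["i2", "i3"],
    "u3": ["i3"],
    "i1": ["u1"],
    "i2": ["u1", "u2"],
    "i3": ["u2", "u3"],
}
neighborhood_u1 = top_k_neighbors(toy_graph, "u1", k=2)
```

Because the result is a plain list per node, it can be written out once offline and read at training time without any graph service, which is the property the summary attributes to the construction stage.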

Core claim

RankGraph-2 shows that the requirements of each lifecycle stage determine the others: serving requires a pre-learned cluster index, training must therefore co-optimize that index, and construction must therefore output self-contained data that supports offline neighborhood computation via subsampling and personalized PageRank, allowing a simple architecture to support U2U2I and U2I2I similarity retrieval at scale.

What carries the argument

The lifecycle co-design that propagates serving constraints backward, implemented as subsampled neighborhoods with popularity bias correction, personalized PageRank pre-computation, and residual-quantization cluster index co-training.
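To make the residual-quantization idea concrete, here is a minimal sketch of encoding an embedding against a stack of codebooks: each level picks the nearest centroid to the current residual and subtracts it. The codebooks below are fixed toy values chosen for the example; in the paper the index is described as co-learned with the embeddings, and that training procedure is not reproduced here. All function names are hypothetical.

```python
def nearest_centroid(vec, codebook):
    """Index of the codebook entry with the smallest squared distance to vec."""
    best_idx, best_dist = 0, float("inf")
    for idx, centroid in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(vec, centroid))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def rq_encode(vec, codebooks):
    """Greedy residual quantization: one code per level, coarse to fine."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest_centroid(residual, codebook)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, residual


def rq_decode(codes, codebooks):
    """Reconstruct the approximation by summing the selected centroids."""
    dim = len(codebooks[0][0])
    out = [0.0] * dim
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy two-level codebooks in 2-D (assumed values, not learned).
codebooks = [
    [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]],      # coarse level
    [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]],  # fine level
]
codes, residual = rq_encode([4.9, 0.8], codebooks)
approx = rq_decode(codes, codebooks)
```

Under this scheme, items that share a code prefix land in the same cluster, so a serving path can look up a cluster by code rather than run a nearest-neighbor search over all embeddings.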

If this is right

  • Subsampling with popularity bias correction reduces hundreds of trillions of edges to hundreds of billions while supporting the retrieval task.
  • Co-learning the residual-quantization cluster index reduces serving computational cost by 83 percent.
  • The resulting system achieves 3.8 times higher recall than a GAT plus Deep Graph Infomax model on bipartite graphs.
  • The system achieves 2.1 times higher recall than PyTorch-BigGraph on item retrieval.
  • Production deployment produces up to 0.96 percent CTR and 2.75 percent CVR gains and supports 20 or more retrieval launches.
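The first bullet above can be illustrated with a small sketch of edge subsampling that down-weights popular items and records a correction weight. The specific rule used here, a keep probability of `min(1, sqrt(threshold / degree))` with an inverse-probability weight on surviving edges, is an assumption borrowed from common practice; the summary does not say which correction the paper applies. Names and data are hypothetical.

```python
import random
from collections import Counter


def keep_probability(degree, threshold):
    """Probability of keeping an edge whose item has the given degree.

    Items at or below the threshold are always kept; popular items are
    thinned progressively. This rule is an assumed example.
    """
    return min(1.0, (threshold / degree) ** 0.5)


def subsample_edges(edges, threshold, seed=0):
    """Subsample (user, item) edges and attach a bias-correction weight.

    Each kept edge carries weight 1 / p_keep so that downstream losses can
    compensate for the thinning of popular items.
    """
    rng = random.Random(seed)
    degree = Counter(item for _, item in edges)
    kept = []
    for user, item in edges:
        p = keep_probability(degree[item], threshold)
        if rng.random() < p:
            kept.append((user, item, 1.0 / p))
    return kept


# Hypothetical data: one very popular item and one niche item.
edges = [(f"u{n}", "hit") for n in range(100)] + [("u0", "niche"), ("u1", "niche")]
sampled = subsample_edges(edges, threshold=4)
hit_edges = [e for e in sampled if e[1] == "hit"]
niche_edges = [e for e in sampled if e[1] == "niche"]
```

The design choice to store the weight alongside the edge keeps the subsampled output self-contained, so training does not need to consult the original degree statistics.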

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The tolerance for pre-computed static neighborhoods may extend the same co-design pattern to other large-scale embedding retrieval problems outside recommendation graphs.
  • The popularity bias correction step could require re-tuning when applied to graphs whose degree distributions differ markedly from the recommendation setting.
  • Hour-level refresh capability opens the possibility of testing the construction pipeline on catalogs with faster item turnover than typical recommendation surfaces.

Load-bearing premise

Similarity-based retrieval tolerates static pre-computed neighborhoods from subsampled graphs rather than requiring dynamic online graph infrastructure, and the subsampling preserves the information needed for the downstream task.

What would settle it

A side-by-side recall measurement on the same billion-node bipartite graph when retrieval uses the paper's pre-computed neighborhoods versus full online KNN or graph queries.
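A minimal sketch of how such a side-by-side comparison could be scored is given below, using recall@k averaged over queries. The retrieved lists and relevance sets are invented for illustration and do not come from the paper; `recall_at_k` and `mean_recall_at_k` are hypothetical helper names.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def mean_recall_at_k(results, ground_truth, k):
    """Average recall@k over all queries present in ground_truth."""
    scores = [recall_at_k(results.get(q, []), rel, k)
              for q, rel in ground_truth.items()]
    return sum(scores) / len(scores)


# Hypothetical ground truth and two retrieval outputs for the same queries.
ground_truth = {
    "q1": {"i1", "i2", "i3", "i4"},
    "q2": {"i5", "i6"},
}
precomputed = {  # retrieval from static, pre-computed neighborhoods
    "q1": ["i1", "i9", "i2", "i7"],
    "q2": ["i5", "i6", "i8", "i1"],
}
online = {  # retrieval from full online KNN or graph queries
    "q1": ["i1", "i2", "i3", "i8"],
    "q2": ["i5", "i6", "i2", "i3"],
}
recall_precomputed = mean_recall_at_k(precomputed, ground_truth, k=4)
recall_online = mean_recall_at_k(online, ground_truth, k=4)
```

A small gap between the two averages on the same graph and query set would support the premise that static neighborhoods suffice; a large gap would undercut it.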

Original abstract

Graph-based retrieval at billion-node scale requires jointly solving three tightly coupled problems -- graph construction, representation learning, and real-time serving -- yet existing work addresses each in isolation. We present RankGraph-2, a framework deployed at Meta that co-designs all three lifecycle stages for similarity-based retrieval (U2U2I and U2I2I), where each stage's requirements shape the others. Serving requires a co-learned cluster index to avoid expensive online KNN -- this pushes index co-training into the training objective. Training benefits from the observation that similarity-based retrieval tolerates pre-computed neighborhoods, eliminating online graph infrastructure -- this requires construction to produce self-contained data. Construction must also support hour-level refresh for item coverage. Acting on these cascading requirements, RankGraph-2 reduces hundreds of trillions of edges to hundreds of billions via subsampling with popularity bias correction, pre-computes multi-hop neighborhoods via personalized PageRank, and co-learns a residual-quantization cluster index that reduces serving computational cost by 83%. This lifecycle co-design enables a simple architecture to achieve 3.8× higher recall than a GAT + Deep Graph Infomax model on a bipartite graph and 2.1× higher than PyTorch-BigGraph on item retrieval. RankGraph-2 delivers up to +0.96% CTR and +2.75% CVR, and has powered 20+ retrieval launches across major surfaces.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript presents RankGraph-2, a deployed framework at Meta for co-designing graph construction, representation learning, and serving in billion-node similarity-based retrieval (U2U2I and U2I2I) for recommendations. Serving constraints drive co-training of a residual-quantization cluster index; this in turn requires construction to produce static, self-contained multi-hop neighborhoods via personalized PageRank after aggressive subsampling with popularity-bias correction. The resulting system is reported to deliver 3.8× higher recall than GAT + Deep Graph Infomax on bipartite graphs and 2.1× higher recall than PyTorch-BigGraph on item retrieval, together with up to +0.96% CTR and +2.75% CVR lifts, and has supported more than 20 production launches.

Significance. If the empirical claims are substantiated by the full experimental sections, the work is significant for large-scale industrial graph retrieval. It supplies concrete evidence that joint optimization across the lifecycle can eliminate online graph infrastructure while preserving retrieval quality, and the production deployment record constitutes a strong real-world validation of the co-design approach.

minor comments (2)
  1. [Abstract] The retrieval tasks U2U2I and U2I2I are referenced without a one-sentence definition; adding this would aid readers outside the immediate sub-area.
  2. [Abstract] The 83% serving-cost reduction is stated as a direct outcome of the co-learned index; a brief parenthetical on the metric (latency, FLOPs, or memory) would improve precision.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of RankGraph-2 and the recommendation for minor revision. The provided summary accurately reflects the paper's focus on lifecycle co-design for billion-node graph retrieval and the reported production outcomes. No major comments appear in the report.

Circularity Check

0 steps flagged

No significant circularity detected


The paper presents an empirical systems framework whose central claims are performance gains measured on deployed retrieval tasks. No equations, fitted parameters, or uniqueness theorems are supplied in the abstract or described text that reduce a reported outcome to an input by construction. The co-design argument is framed as a set of engineering constraints (serving cost, refresh latency, neighborhood pre-computation) that motivate implementation choices, with results presented as measured consequences rather than algebraic identities. Self-citations, if present, are not invoked as load-bearing uniqueness results. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated beyond high-level design choices.

pith-pipeline@v0.9.1-grok · 5831 in / 1073 out tokens · 33089 ms · 2026-06-30T10:40:41.137182+00:00 · methodology

