Estimating Random-Walk Probabilities in Directed Graphs
Pith reviewed 2026-05-22 18:55 UTC · model grok-4.3
The pith
Algorithms and matching lower bounds now let any discounted random-walk probability in a directed graph be estimated in near-linear time.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that for every variant of the estimation problem and every allowed combination of out-neighbor or similar queries, there exist algorithms whose running time is O(m log n) in the worst case and that no algorithm can do substantially better than Omega(min(m, 1/delta)) queries in the worst case or Omega(1/delta) in the average case, where delta is the target probability value; these bounds are tight up to logarithmic factors and hold even when the probability can be as small as 1 over n to a polynomial power.
What carries the argument
The central mechanisms are new deterministic estimation procedures that achieve O(m log n) time independent of how small the probability is, together with query-complexity lower-bound constructions that force Omega(min(m, 1/delta)) work for the single-pair case.
If this is right
- Any probability value, no matter how small, can be estimated deterministically after a single near-linear-time pass over the edges.
- The single-pair lower bound of Omega(min(m, 1/delta)) matches the time of a simple Monte Carlo sampler and is therefore optimal up to logs.
- All previously open polynomial gaps for single-source, single-target, and single-node variants are now resolved in both worst-case and average-case regimes.
- The same tight bounds apply uniformly across every combination of allowed graph queries examined in the paper.
Where Pith is reading between the lines
- If the same query model is used in practice on web-scale graphs, the new upper bounds imply that a single preprocessing pass suffices for arbitrarily small scores instead of repeated power iterations.
- The lower-bound technique may extend directly to other termination or absorption probabilities on the same directed-graph query model.
- Removing the restriction to out-neighbor queries, for example by allowing random edge sampling, would be a natural next setting in which to test whether the linear-time upper bound can be improved.
Load-bearing premise
The claimed optimality assumes that the only operations allowed are the exact query models stated in each problem variant and that no additional graph structure or preprocessing is available.
What would settle it
A concrete falsifier would be a directed graph family and a probability value delta where every algorithm using only the allowed out-neighbor queries requires asymptotically more than O(m log n) time to output a constant-relative-error estimate with constant success probability.
Figures
read the original abstract
We study discounted random walks in directed graphs. In each step, the walk either terminates with a constant probability $\alpha$, or proceeds to a random out-neighbor. Our goal is to estimate the probability $\pi(s, t)$ that a discounted random walk starting from $s$ terminates at $t$. This probability is also known as the Personalized PageRank (PPR) score, which measures the relevance of $t$ to $s$, for instance, when $s$ and $t$ are web pages on the Internet. We aim to estimate $\pi(s, t)$ within a constant relative error with constant probability. A variety of algorithms have been developed for several problem variants, such as single-pair, single-source, single-target, and single-node estimation, under both worst-case and average-case settings, and for different combinations of allowed graph queries. However, in many important cases, there remain polynomial gaps between known upper and lower bounds. In this paper, we establish tight upper and lower bounds (up to logarithmic factors of $n$) for all problem variants and query combinations, closing all existing gaps in both the worst-case and average-case settings. Below we give some examples for the worst-case settings. As an upper-bound example, the classic power method estimates $\pi(s,t)$ if it is above a threshold $\delta$ in time $O(m\log(1/\delta))$ but $\pi(s,t)$ can be as small as $1/n^{\Theta(n)}$. For contrast, we propose algorithms that deterministically estimate arbitrarily small $\pi(s,t)$ in $O(m\log n)$ time. As a lower-bound example, we improve the lower bound for the single-pair problem from $\Omega(\min\{n,1/\delta\})$ to $\Omega(\min\{m,1/\delta\})$, which is optimal (up to logarithmic factors) since a simple Monte Carlo estimate takes $O(1/\delta)$ time.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies estimating the termination probability π(s,t) of a discounted random walk (with termination probability α) from s to t in a directed graph, equivalent to the Personalized PageRank score. It develops algorithms and lower bounds for single-pair, single-source, single-target, and single-node variants under different graph query models, in both worst-case and average-case settings, claiming to close all prior polynomial gaps and achieve tight bounds up to O(log n) factors.
Significance. If the central claims hold, the work is significant for resolving open complexity questions in random-walk estimation on graphs. The deterministic O(m log n) estimator for arbitrarily small π(s,t) and the improved Ω(min{m, 1/δ}) lower bound for single-pair estimation (matching the Monte Carlo upper bound up to logs) are notable strengths. The paper supplies explicit algorithmic constructions and matching lower-bound arguments across all listed variants and query combinations.
minor comments (3)
- [Abstract] Abstract: the statement that the new deterministic estimator runs in O(m log n) time for arbitrarily small π(s,t) should explicitly reference the query model (out-neighbor access) and confirm that no additional preprocessing is assumed.
- The lower-bound section should include a brief remark on whether the Ω(min{m, 1/δ}) construction extends unchanged to the average-case setting or requires a separate argument.
- Consider adding a summary table that lists, for each variant and query combination, the new upper and lower bounds (including the log n factors) to improve readability.
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. We are glad that the central claims and contributions, including the deterministic O(m log n) estimator and the improved single-pair lower bound, are viewed as notable strengths.
Circularity Check
No significant circularity detected
full rationale
The paper derives new upper-bound algorithms (e.g., deterministic O(m log n) estimators for arbitrarily small π(s,t)) and an improved lower bound (Ω(min{m,1/δ}) for single-pair estimation) directly from standard query-model analysis, Monte Carlo sampling costs, and information-theoretic arguments on graph traversal. These steps rely on explicit constructions and reductions that are independent of the target bounds; they do not reduce by the paper's own equations to fitted parameters, self-definitions, or load-bearing self-citations. The claimed tightness up to log n factors follows from matching the presented algorithms against the lower-bound constructions under the stated access models, without circular renaming or imported uniqueness theorems.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Directed graphs admit out-neighbor queries and the standard query models for each problem variant
Forward citations
Cited by 2 Pith papers
-
Instance-Optimality in PageRank Computation
An adaptive bidirectional algorithm for constant-relative-error PageRank estimation is instance-optimal up to polylog factors on bounded-degree directed graphs and on sparse graphs with polylog high-degree vertices.
-
Near-Optimality for Single-Source Personalized PageRank
Establishes near-optimal time bounds for SSPPR-A and SSPPR-R queries, with matching upper and lower bounds for SSPPR-R on graphs where m is Omega(n log squared n).
Reference graph
Works this paper leans on
-
[1]
Reid Andersen, Christian Borgs, Jennifer T. Chayes, John E. Hopcroft, Vahab S. Mirrokni, and Shang-Hua Teng. Local computation of pagerank contributions. In Anthony Bonato and Fan R. K. Chung, editors,Algorithms and Models for the Web-Graph, 5th International Workshop, WA W 2007, San Diego, CA, USA, December 11-12, 2007, Proceedings, volume 4863 ofLecture...
work page 2007
-
[2]
Reid Andersen, Christian Borgs, Jennifer T. Chayes, John E. Hopcroft, Vahab S. Mirrokni, and Shang-Hua Teng. Local computation of pagerank contributions.Internet Math., 5(1):23–45, 2008
work page 2008
-
[3]
Sivakumar, Yushi Jing, Jay Yagnik, Shankar Kumar, Deepak Ravichandran, and Mohamed Aly
Shumeet Baluja, Rohan Seth, D. Sivakumar, Yushi Jing, Jay Yagnik, Shankar Kumar, Deepak Ravichandran, and Mohamed Aly. Video suggestion and discovery for youtube: taking random walks through the view graph. In Jinpeng Huai, Robin Chen, Hsiao-Wuen Hon, Yunhao Liu, Wei-Ying Ma, Andrew Tomkins, and Xiaodong Zhang, editors,Proceedings of the 17th Internationa...
work page 2008
-
[4]
Local approximation of pagerank and reverse pagerank
Ziv Bar-Yossef and Li-Tal Mashiach. Local approximation of pagerank and reverse pagerank. In Sung-Hyon Myaeng, Douglas W. Oard, Fabrizio Sebastiani, Tat-Seng Chua, and Mun- Kew Leong, editors,Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2008, Singapore, July 20-24, 2008, page...
work page 2008
-
[5]
Sublinear algorithms for local graph cen- trality estimation
Marco Bressan, Enoch Peserico, and Luca Pretto. Sublinear algorithms for local graph cen- trality estimation. In Mikkel Thorup, editor,59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 709–718. IEEE Computer Society, 2018
work page 2018
-
[6]
Sublinear algorithms for local graph- centrality estimation.SIAM J
Marco Bressan, Enoch Peserico, and Luca Pretto. Sublinear algorithms for local graph- centrality estimation.SIAM J. Comput., 52(4):968–1008, 2023. 33
work page 2023
-
[7]
The anatomy of a large-scale hypertextual web search engine
Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. Comput. Networks, 30(1-7):107–117, 1998
work page 1998
-
[8]
Local methods for estimating pagerank values
Yen-Yu Chen, Qingqing Gan, and Torsten Suel. Local methods for estimating pagerank values. In David A. Grossman, Luis Gravano, ChengXiang Zhai, Otthein Herzog, and David A. Evans, editors,Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, Washington, DC, USA, November 8-13, 2004, pages 381–389. ACM, 2004
work page 2004
-
[9]
Michael B. Cohen, Jonathan A. Kelner, John Peebles, Richard Peng, Anup B. Rao, Aaron Sidford, and Adrian Vladu. Almost-linear-time algorithms for markov chains and new spectral primitives for directed graphs. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, ...
work page 2017
-
[10]
Michael B. Cohen, Jonathan A. Kelner, John Peebles, Richard Peng, Aaron Sidford, and Adrian Vladu. Faster algorithms for computing the stationary distribution, simulating random walks, and more. In Irit Dinur, editor,IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, 9-11 October 2016, Hyatt Regency, New Brunswick, New Jersey, USA, ...
work page 2016
-
[11]
Nadav Eiron, Kevin S. McCurley, and John A. Tomlin. Ranking the web frontier. In Stuart I. Feldman, Mike Uretsky, Marc Najork, and Craig E. Wills, editors,Proceedings of the 13th international conference on World Wide Web, WWW 2004, New York, NY, USA, May 17-20, 2004, pages 309–318. ACM, 2004
work page 2004
-
[12]
D´ aniel Fogaras, Bal´ azs R´ acz, K´ aroly Csalog´ any, and Tam´ as Sarl´ os. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments.Internet Mathematics, 2, 01 2005
work page 2005
-
[13]
Kimon Fountoulakis, Farbod Roosta-Khorasani, Julian Shun, Xiang Cheng, and Michael W. Mahoney. Variational perspective on local graph clustering.Math. Program., 174(1-2):553–573, 2019
work page 2019
-
[14]
David F. Gleich. Pagerank beyond the web.SIAM Rev., 57(3):321–363, 2015
work page 2015
-
[15]
David F. Gleich and Marzia Polito. Approximating personalized pagerank with minimal use of web graph data.Internet Math., 3(3):257–294, 2007
work page 2007
-
[16]
Property testing in bounded degree graphs.Algorithmica, 32(2):302–343, 2002
Oded Goldreich and Dana Ron. Property testing in bounded degree graphs.Algorithmica, 32(2):302–343, 2002
work page 2002
-
[17]
Parallel personalized pagerank on dynamic graphs.Proc
Wentian Guo, Yuchen Li, Mo Sha, and Kian-Lee Tan. Parallel personalized pagerank on dynamic graphs.Proc. VLDB Endow., 11(1):93–106, 2017
work page 2017
-
[18]
WTF: the who to follow service at twitter
Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Zadeh. WTF: the who to follow service at twitter. In Daniel Schwabe, Virg´ ılio A. F. Almeida, Hartmut Glaser, Ricardo Baeza-Yates, and Sue B. Moon, editors,22nd International World Wide Web Conference, WWW ’13, Rio de Janeiro, Brazil, May 13-17, 2013, pages 505–514. International Wo...
work page 2013
-
[19]
Zolt´ an Gy¨ ongyi, Hector Garcia-Molina, and Jan O. Pedersen. Combating web spam with trustrank. In Mario A. Nascimento, M. Tamer ¨Ozsu, Donald Kossmann, Ren´ ee J. Miller, Jos´ e A. Blakeley, and K. Bernhard Schiefer, editors,(e)Proceedings of the Thirtieth Inter- national Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31 - Sept...
work page 2004
-
[20]
Dynamic pagerank: Algorithms and lower bounds
Rajesh Jayaram, Jakub Lacki, Slobodan Mitrovic, Krzysztof Onak, and Piotr Sankowski. Dynamic pagerank: Algorithms and lower bounds. In Karl Bringmann, Martin Grohe, Gabriele Puppis, and Ola Svensson, editors,51st International Colloquium on Automata, Languages, and Programming, ICALP 2024, July 8-12, 2024, Tallinn, Estonia, volume 297 ofLIPIcs, pages 90:1...
work page 2024
-
[21]
Scaling personalized web search
Glen Jeh and Jennifer Widom. Scaling personalized web search. In Guszt´ av Hencsey, Bebo White, Yih-Farn Robin Chen, L´ aszl´ o Kov´ acs, and Steve Lawrence, editors,Proceedings of the Twelfth International World Wide Web Conference, WWW 2003, Budapest, Hungary, May 20-24, 2003, pages 271–279. ACM, 2003
work page 2003
-
[22]
Simulating random walks on graphs in the streaming model
Ce Jin. Simulating random walks on graphs in the streaming model. In Avrim Blum, editor, 10th Innovations in Theoretical Computer Science Conference, ITCS 2019, January 10-12, 2019, San Diego, California, USA, volume 124 ofLIPIcs, pages 46:1–46:15. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik, 2019
work page 2019
-
[23]
Kelner, Lorenzo Orecchia, Aaron Sidford, and Zeyuan Allen Zhu
Jonathan A. Kelner, Lorenzo Orecchia, Aaron Sidford, and Zeyuan Allen Zhu. A simple, combinatorial algorithm for solving SDD systems in nearly-linear time. In Dan Boneh, Tim Roughgarden, and Joan Feigenbaum, editors,Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 911–920. ACM, 2013
work page 2013
-
[24]
Ioannis Koutis, Gary L. Miller, and Richard Peng. A nearly-m log n time solver for SDD linear systems. In Rafail Ostrovsky, editor,IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011, Palm Springs, CA, USA, October 22-25, 2011, pages 590–598. IEEE Computer Society, 2011
work page 2011
-
[25]
Walking randomly, massively, and efficiently
Jakub Lacki, Slobodan Mitrovic, Krzysztof Onak, and Piotr Sankowski. Walking randomly, massively, and efficiently. In Konstantin Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Kamath, and Julia Chuzhoy, editors,Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 364–377. ACM, 2020
work page 2020
-
[26]
Bidirectional pagerank estimation: From average-case to worst-case
Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. Bidirectional pagerank estimation: From average-case to worst-case. In David F. Gleich, J´ ulia Komj´ athy, and Nelly Litvak, editors,Algorithms and Models for the Web Graph - 12th International Workshop, WA W 2015, Eindhoven, The Netherlands, December 10-11, 2015, Proceedings, volume 9479 ofLecture Not...
work page 2015
-
[27]
Personalized pagerank estimation and search: A bidirectional approach
Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. Personalized pagerank estimation and search: A bidirectional approach. In Paul N. Bennett, Vanja Josifovski, Jennifer Neville, and Filip Radlinski, editors,Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, San Francisco, CA, USA, February 22-25, 2016, pages 163–172. AC...
work page 2016
-
[28]
FAST-PPR: scaling personalized pagerank estimation for large graphs
Peter Lofgren, Siddhartha Banerjee, Ashish Goel, and Seshadhri Comandur. FAST-PPR: scaling personalized pagerank estimation for large graphs. In Sofus A. Macskassy, Claudia Perlich, Jure Leskovec, Wei Wang, and Rayid Ghani, editors,The 20th ACM SIGKDD In- ternational Conference on Knowledge Discovery and Data Mining, KDD ’14, New York, NY, USA - August 24...
work page 2014
-
[29]
Personalized PageRank to a Target Node
Peter Lofgren and Ashish Goel. Personalized pagerank to a target node.CoRR, abs/1304.4658, 2013
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[30]
Computing per- sonalized pagerank quickly by exploiting graph structures.Proc
Takanori Maehara, Takuya Akiba, Yoichi Iwata, and Ken-ichi Kawarabayashi. Computing per- sonalized pagerank quickly by exploiting graph structures.Proc. VLDB Endow., 7(12):1023– 1034, 2014
work page 2014
-
[31]
The pagerank citation ranking: Bringing order to the web
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford infolab, 1999
work page 1999
-
[32]
Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In L´ aszl´ o Babai, editor,Proceedings of the 36th Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, June 13-16, 2004, pages 81–90. ACM, 2004
work page 2004
-
[33]
Revisiting local pagerank estimation on undirected graphs: Simple and optimal
Hanzhi Wang. Revisiting local pagerank estimation on undirected graphs: Simple and optimal. In Ricardo Baeza-Yates and Francesco Bonchi, editors,Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024, Barcelona, Spain, August 25-29, 2024, pages 3036–3044. ACM, 2024
work page 2024
-
[34]
Hanzhi Wang, Mingguo He, Zhewei Wei, Sibo Wang, Ye Yuan, Xiaoyong Du, and Ji-Rong Wen. Approximate graph propagation. In Feida Zhu, Beng Chin Ooi, and Chunyan Miao, editors,KDD ’21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, Singapore, August 14-18, 2021, pages 1686–1696. ACM, 2021
work page 2021
-
[35]
Personalized pagerank to a target node, revisited
Hanzhi Wang, Zhewei Wei, Junhao Gan, Sibo Wang, and Zengfeng Huang. Personalized pagerank to a target node, revisited. In Rajesh Gupta, Yan Liu, Jiliang Tang, and B. Aditya Prakash, editors,KDD ’20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, August 23-27, 2020, pages 657–667. ACM, 2020
work page 2020
-
[36]
Edge-based local push for personalized pagerank.Proc
Hanzhi Wang, Zhewei Wei, Junhao Gan, Ye Yuan, Xiaoyong Du, and Ji-Rong Wen. Edge-based local push for personalized pagerank.Proc. VLDB Endow., 15(7):1376–1389, 2022
work page 2022
-
[37]
Revisiting local computation of pagerank: Simple and optimal
Hanzhi Wang, Zhewei Wei, Ji-Rong Wen, and Mingji Yang. Revisiting local computation of pagerank: Simple and optimal. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, edi- tors,Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 911–922. ACM, 2024
work page 2024
-
[38]
Efficient algorithms for finding approximate heavy hitters in per- sonalized pageranks
Sibo Wang and Yufei Tao. Efficient algorithms for finding approximate heavy hitters in per- sonalized pageranks. In Gautam Das, Christopher M. Jermaine, and Philip A. Bernstein, editors,Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018, Houston, TX, USA, June 10-15, 2018, pages 1113–1127. ACM, 2018
work page 2018
-
[39]
Efficient algorithms for approximate single-source personalized pagerank queries.ACM Trans
Sibo Wang, Renchi Yang, Runhui Wang, Xiaokui Xiao, Zhewei Wei, Wenqing Lin, Yin Yang, and Nan Tang. Efficient algorithms for approximate single-source personalized pagerank queries.ACM Trans. Database Syst., 44(4):18:1–18:37, 2019. 36
work page 2019
-
[40]
FORA: simple and effective approximate single-source personalized pagerank
Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, and Yin Yang. FORA: simple and effective approximate single-source personalized pagerank. InProceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13 - 17, 2017, pages 505–514. ACM, 2017
work page 2017
-
[41]
Approximating single-source personalized pager- ank with absolute error guarantees
Zhewei Wei, Ji-Rong Wen, and Mingji Yang. Approximating single-source personalized pager- ank with absolute error guarantees. In Graham Cormode and Michael Shekelyan, editors, 27th International Conference on Database Theory, ICDT 2024, March 25-28, 2024, Paes- tum, Italy, volume 290 ofLIPIcs, pages 9:1–9:19. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Infor...
work page 2024
-
[42]
Unifying the global and local approaches: An efficient power iteration with forward push
Hao Wu, Junhao Gan, Zhewei Wei, and Rui Zhang. Unifying the global and local approaches: An efficient power iteration with forward push. In Guoliang Li, Zhanhuai Li, Stratos Idreos, and Divesh Srivastava, editors,SIGMOD ’21: International Conference on Management of Data, Virtual Event, China, June 20-25, 2021, pages 1996–2008. ACM, 2021
work page 2021
-
[43]
Efficient algorithms for personalized pagerank computation: A survey.IEEE Trans
Mingji Yang, Hanzhi Wang, Zhewei Wei, Sibo Wang, and Ji-Rong Wen. Efficient algorithms for personalized pagerank computation: A survey.IEEE Trans. Knowl. Data Eng., 36(9):4582– 4602, 2024
work page 2024
-
[44]
Andrew Chi-Chih Yao. Probabilistic computations: Toward a unified measure of complexity (extended abstract). In18th Annual Symposium on Foundations of Computer Science, Prov- idence, Rhode Island, USA, 31 October - 1 November 1977, pages 222–227. IEEE Computer Society, 1977. 37 A Table of Notations Table 2: Table of notations. Notation Description G= (V, ...
work page 1977
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.