Revealing graph bandits for maximizing local influence
Pith reviewed 2026-05-09 20:30 UTC · model grok-4.3
The pith
A bandit algorithm identifies the most influential node in an unknown graph by querying nodes and receiving stochastic influence sets, with regret scaling by detectable dimension.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the graph bandit setting for local influence maximization, the learner selects nodes sequentially and observes stochastic influence sets to identify the global maximum-influence node. BARE is a bandit strategy that achieves a regret bound scaling with the detectable dimension, a quantity that captures the effective complexity of the graph based on how much structure can be revealed through the observations.
What carries the argument
The detectable dimension, a problem-dependent quantity measuring how much the influence observations allow revealing the graph structure and distinguishing the most influential node.
Load-bearing premise
Querying a node must produce a stochastic set of currently influenced nodes that carries enough information to gradually reveal the graph structure and identify the global most-influential node; if the observations are too noisy, the detectable dimension may not stay small.
What would settle it
Running BARE on a graph family where the detectable dimension is expected to be small and finding that empirical regret instead grows linearly with the number of nodes would falsify the scaling claim.
Figures
read the original abstract
We study a graph bandit setting where the objective of the learner is to detect the most influential node of a graph by requesting as little information from the graph as possible. One of the relevant applications for this setting is marketing in social networks, where the marketer aims at finding and taking advantage of the most influential customers. The existing approaches for bandit problems on graphs require either partial or complete knowledge of the graph. In this paper, we do not assume any knowledge of the graph, but we consider a setting where it can be gradually discovered in a sequential and active way. At each round, the learner chooses a node of the graph and the only information it receives is a stochastic set of the nodes that the chosen node is currently influencing. To address this setting, we propose BARE, a bandit strategy for which we prove a regret guarantee that scales with the detectable dimension, a problem dependent quantity that is often much smaller than the number of nodes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a graph bandit problem in which the learner must identify the most influential node by sequentially querying nodes and receiving only stochastic sets of nodes currently influenced by the queried node, with no prior knowledge of the graph structure. The authors propose the BARE algorithm and claim a regret bound that scales with a problem-dependent 'detectable dimension' rather than the ambient number of nodes n, with applications to influence maximization in social networks.
Significance. If the regret analysis holds and the detectable dimension is indeed small for typical graphs, the result would be significant for bandit algorithms on graphs: it removes the common assumption of known or partially known graph structure and ties performance to an intrinsic, observable quantity. The active-query model that gradually reveals influence sets is a natural fit for the application and the problem-dependent bound is a clear strength when the observations are sufficiently informative.
minor comments (3)
- The abstract and introduction state that a regret guarantee is proved for BARE, but the main text should include an explicit statement of the theorem (including the precise dependence on the detectable dimension) immediately after the algorithm description rather than deferring all details to an appendix.
- The definition of 'detectable dimension' is central to the claim; it should be given a numbered definition and illustrated with a small example graph in Section 3 or 4 so that readers can verify when the quantity is indeed much smaller than n.
- The experimental section would benefit from a plot or table showing the empirical growth of the detectable dimension across the tested graphs, to support the claim that it remains small in practice.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our work on the graph bandit problem for identifying the most influential node and for recommending minor revision. We appreciate the recognition that the active-query model and the problem-dependent regret bound scaling with the detectable dimension represent a strength when observations are informative.
Circularity Check
No significant circularity in derivation chain
full rationale
The paper introduces BARE for graph bandits without prior graph knowledge, where queries return stochastic influence sets, and claims a regret bound scaling with the detectable dimension (a problem-dependent quantity smaller than node count). No equations, fitted parameters, or self-citations are visible in the provided abstract or summary that reduce the regret claim to a self-defined input by construction. The detectable dimension is presented as an external property of the observation model rather than redefined from the algorithm's output or regret itself. The derivation appears self-contained against the stated assumptions, with the regret scaling serving as a standard problem-dependent analysis rather than a tautological renaming or fitted prediction.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
From bandits to experts: A tale of domination and independence
Alon, Noga, Cesa-Bianchi, Nicol \` o , Gentile, Claudio, and Mansour, Yishay. From bandits to experts: A tale of domination and independence . In Neural Information Processing Systems, 2013
work page 2013
-
[2]
Online learning with feedback graphs: Beyond bandits
Alon, Noga, Cesa-Bianchi, Nicol \` o , Dekel, Ofer, and Koren, Tomer. Online learning with feedback graphs: Beyond bandits . In Conference on Learning Theory, 2015
work page 2015
-
[3]
Minimax policies for adversarial and stochastic bandits
Audibert, Jean-Yves and Bubeck, S \' e bastien. Minimax policies for adversarial and stochastic bandits . In Conference on Learning Theory, 2009
work page 2009
-
[4]
Emergence of scaling in random networks
Barab \' a si, Albert-L \' a szl \' o and Albert, R \' e ka. Emergence of scaling in random networks . Science, 286: 0 11, 1999
work page 1999
-
[5]
Bandit algorithms for social network queries
Bnaya, Zahy, Puzis, Rami, Stern, Roni, and Felner, Ariel. Bandit algorithms for social network queries . In International Conference on Social Computing, 2013 a
work page 2013
-
[6]
Social network search as a volatile multi-armed bandit problem
Bnaya, Zahy, Puzis, Rami, Stern, Roni, and Felner, Ariel. Social network search as a volatile multi-armed bandit problem . Human Journal, 2 0 (2): 0 84--98, 2013 b
work page 2013
-
[7]
Stochastic bandits with side observations on networks
Buccapatnam, Swapna, Eryilmaz, Atilla, and Shroff, Ness B. Stochastic bandits with side observations on networks . In International Conference on Measurement and Modeling of Computer Systems, 2014
work page 2014
-
[8]
Leveraging side observations in stochastic bandits
Caron, St \' e phane, Kveton, Branislav, Lelarge, Marc, and Bhagat, Smriti. Leveraging side observations in stochastic bandits. In Uncertainty in Artificial Intelligence, 2012
work page 2012
-
[9]
Cesa-Bianchi, Nicol \` o , Gentile, Claudio, and Zappella, Giovanni. A gang of bandits . In Neural Information Processing Systems, 2013
work page 2013
-
[10]
Combinatorial multi-armed bandit and its extension to probabilistically triggered arms
Chen, Wei, Wang, Yajun, and Yuan, Yang. Combinatorial multi-armed bandit and its extension to probabilistically triggered arms . Technical report, 2015
work page 2015
-
[11]
Unimodal bandits: Regret lower bounds and optimal algorithms
Combes, Richard and Prouti \` e re, Alexandre. Unimodal bandits: Regret lower bounds and optimal algorithms . In International Conference on Machine Learning, 2014
work page 2014
-
[12]
Networked bandits with disjoint linear payoffs
Fang, Meng and Tao, Dacheng. Networked bandits with disjoint linear payoffs . In Internattional Conference on Knowledge Discovery and Data Mining, 2014
work page 2014
-
[13]
Gentile, Claudio, Li, Shuai, and Zappella, Giovanni. Online clustering of bandits . In International Conference on Machine Learning, 2014
work page 2014
-
[14]
Community structure in social and biological networks
Girvan, Michelle and Newman, Mark E J. Community structure in social and biological networks. National Academy of Sciences of the United States of America, 99 0 (12): 0 7821--6, 2002
work page 2002
-
[15]
Online spectral learning on a graph with bandit feedback
Gu, Quanquan and Han, Jiawei. Online spectral learning on a graph with bandit feedback . In International Conference on Data Mining, 2014
work page 2014
-
[16]
Maximizing the spread of influence through a social network
Kempe, David, Kleinberg, Jon, and Tardos, \' E va. Maximizing the spread of influence through a social network . Knowledge Discovery and Data mining, pp.\ 137, 2003
work page 2003
-
[17]
Maximizing the spread of influence through a social network
Kempe, David, Kleinberg, Jon, and Tardos, \' E va. Maximizing the spread of influence through a social network . Theory of Computing, 11 0 (4): 0 105--147, 2015
work page 2015
-
[18]
Klimt, Bryan and Yang, Yiming. Introducing the Enron corpus . In Collaboration, Electronic messaging, Anti-Abuse and Spam Conference, 2004
work page 2004
-
[19]
Efficient learning by implicit exploration in bandit problems with side observations
Koc \' a k, Tom \' a s , Neu, Gergely, Valko, Michal, and Munos, R \' e mi. Efficient learning by implicit exploration in bandit problems with side observations . In Neural Information Processing Systems, 2014 a
work page 2014
-
[20]
Koc \' a k, Tom \' a s , Valko, Michal, Munos, R \' e mi, and Agrawal, Shipra. Spectral Thompson sampling . In AAAI Conference on Artificial Intelligence, 2014 b
work page 2014
-
[21]
Online learning with noisy side observations
Koc \' a k, Tom \' a s , Neu, Gergely, and Valko, Michal. Online learning with noisy side observations . In International Conference on Artificial Intelligence and Statistics, 2016
work page 2016
-
[22]
Asymptotically efficient adaptive allocation rules
Lai, Tze L and Robbins, Herbert. Asymptotically efficient adaptive allocation rules . Advances in Applied Mathematics, 6 0 (1): 0 4--22, 1985
work page 1985
-
[23]
Lei, Siyu, Maniu, Silviu, Mo, Luyi, Cheng, Reynold, and Senellart, Pierre. Online influence maximization . In Knowledge Discovery and Data mining, 2015
work page 2015
-
[24]
SNAP datasets: Stanford large network dataset collection
Leskovec, Jure and Krevl, Andrej. SNAP datasets: Stanford large network dataset collection . http://snap.stanford.edu/data, 2014
work page 2014
-
[25]
From bandits to experts: On the value of side-observations
Mannor, Shie and Shamir, Ohad. From bandits to experts: On the value of side-observations . In Neural Information Processing Systems, 2011
work page 2011
-
[26]
Ripeanu, Matei, Iamnitchi, Adriana, and Foster, Ian. Mapping the Gnutella network . IEEE Internet Computing, 6 0 (1): 0 50--57, 2002
work page 2002
-
[27]
Information gathering in networks via active exploration
Singla, Adish, Horvitz, Eric, Kohli, Pushmeet, White, Ryen, and Krause, Andreas. Information gathering in networks via active exploration . In International Joint Conferences on Artificial Intelligence, 2015
work page 2015
-
[28]
Spectral bandits for smooth graph functions
Valko, Michal, Munos, R \' e mi, Kveton, Branislav, and Koc \' a k, Tom \' a s . Spectral bandits for smooth graph functions . In International Conference on Machine Learning, 2014
work page 2014
- [29]
-
[30]
On the evolution of user interaction in facebook
Viswanath, Bimal, Mislove, Alan, Cha, Meeyoung, and Gummadi, Krishna P. On the evolution of user interaction in facebook . In ACM Workshop on Online Social Networks, 2009
work page 2009
-
[31]
Online learning with Gaussian payoffs and side observations
Wu, Yifan, Gy \" o rgy, Andr \' a s, and Szepesv \' a ri, Csaba. Online learning with Gaussian payoffs and side observations . In Neural Information Processing Systems, 2015
work page 2015
-
[32]
Yu, Jia Yuan and Mannor, Shie. Unimodal bandits . In International Conference on Machine Learning, 2011
work page 2011
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.