Revealing graph bandits for maximizing local influence

Alexandra Carpentier; Michal Valko

arxiv: 2605.00489 · v1 · submitted 2026-05-01 · 💻 cs.LG

Revealing graph bandits for maximizing local influence

Alexandra Carpentier , Michal Valko This is my paper

Pith reviewed 2026-05-09 20:30 UTC · model grok-4.3

classification 💻 cs.LG

keywords graph banditsinfluence maximizationbandit algorithmsregret boundssocial networksdetectable dimensionactive learning

0 comments

The pith

A bandit algorithm identifies the most influential node in an unknown graph by querying nodes and receiving stochastic influence sets, with regret scaling by detectable dimension.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies a graph bandit problem aimed at locating the single most influential node while learning the graph structure on the fly. The learner picks a node each round and receives only a random subset of nodes currently influenced by that choice. This setup models scenarios such as social-network marketing where the full graph is unavailable but partial influence feedback can be observed sequentially. The authors present the BARE strategy and prove a regret bound that depends on the detectable dimension, a graph-specific quantity that is frequently much smaller than the total number of nodes.

Core claim

In the graph bandit setting for local influence maximization, the learner selects nodes sequentially and observes stochastic influence sets to identify the global maximum-influence node. BARE is a bandit strategy that achieves a regret bound scaling with the detectable dimension, a quantity that captures the effective complexity of the graph based on how much structure can be revealed through the observations.

What carries the argument

The detectable dimension, a problem-dependent quantity measuring how much the influence observations allow revealing the graph structure and distinguishing the most influential node.

Load-bearing premise

Querying a node must produce a stochastic set of currently influenced nodes that carries enough information to gradually reveal the graph structure and identify the global most-influential node; if the observations are too noisy, the detectable dimension may not stay small.

What would settle it

Running BARE on a graph family where the detectable dimension is expected to be small and finding that empirical regret instead grows linearly with the number of nodes would falsify the scaling claim.

Figures

Figures reproduced from arXiv: 2605.00489 by Alexandra Carpentier, Michal Valko.

**Figure 1.** Figure 1: Left: Barab´asi-Albert. Middle left: Facebook. Middle right: Enron. Right: Gnutella. round t regret Graph: Barabasi-Albert - Number of runs: 100 - revelation p = 0.20 0 200 400 600 800 1000 1200 0 0.5 1 1.5 2 2.5 3 3.5 4 × 104 BARE GraphMOSS Db⋆ = 529, Tb⋆ = 147 round t regret Graph: Barabasi-Albert - Number of runs: 100 - revelation p = 0.40 0 200 400 600 800 1000 1200 0 1 2 3 4 5 6 × 104 BARE GraphMOSS D… view at source ↗

**Figure 2.** Figure 2: Barab´asi-Albert model with varying p between 0.2 and 1 We first performed an experiment on a graph generated by 10-out-degree Barab´asi-Albert model with d = 1000 nodes [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: Left: Star graph. Right: Complete graph. In [PITH_FULL_IMAGE:figures/full_fig_p011_3.png] view at source ↗

**Figure 4.** Figure 4: Left: Barab´asi-Albert. Middle left: Facebook. Middle right: Enron. Right: Gnutella. 1 If we consider only the information we obtain for a node from the graph, and not the information we obtain from pulling the node, the error is at least d p ε/n even at the end of the budget, which is higher than the gap q rD/n, and therefore the information coming from the graph structure does not significantly help. 2We… view at source ↗

read the original abstract

We study a graph bandit setting where the objective of the learner is to detect the most influential node of a graph by requesting as little information from the graph as possible. One of the relevant applications for this setting is marketing in social networks, where the marketer aims at finding and taking advantage of the most influential customers. The existing approaches for bandit problems on graphs require either partial or complete knowledge of the graph. In this paper, we do not assume any knowledge of the graph, but we consider a setting where it can be gradually discovered in a sequential and active way. At each round, the learner chooses a node of the graph and the only information it receives is a stochastic set of the nodes that the chosen node is currently influencing. To address this setting, we propose BARE, a bandit strategy for which we prove a regret guarantee that scales with the detectable dimension, a problem dependent quantity that is often much smaller than the number of nodes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces a zero-knowledge graph bandit setting for influence maximization where queries return stochastic influence sets, and BARE achieves regret scaling with a detectable dimension smaller than n.

read the letter

The main takeaway is that this paper defines a bandit problem for finding the most influential node in a graph with no initial knowledge at all. You pick nodes to query and receive only a random subset of nodes currently influenced by that choice, and the learner has to identify the global best one while keeping regret low. They propose BARE and claim a regret bound that depends on this detectable dimension rather than the full node count.

Referee Report

0 major / 3 minor

Summary. The paper introduces a graph bandit problem in which the learner must identify the most influential node by sequentially querying nodes and receiving only stochastic sets of nodes currently influenced by the queried node, with no prior knowledge of the graph structure. The authors propose the BARE algorithm and claim a regret bound that scales with a problem-dependent 'detectable dimension' rather than the ambient number of nodes n, with applications to influence maximization in social networks.

Significance. If the regret analysis holds and the detectable dimension is indeed small for typical graphs, the result would be significant for bandit algorithms on graphs: it removes the common assumption of known or partially known graph structure and ties performance to an intrinsic, observable quantity. The active-query model that gradually reveals influence sets is a natural fit for the application and the problem-dependent bound is a clear strength when the observations are sufficiently informative.

minor comments (3)

The abstract and introduction state that a regret guarantee is proved for BARE, but the main text should include an explicit statement of the theorem (including the precise dependence on the detectable dimension) immediately after the algorithm description rather than deferring all details to an appendix.
The definition of 'detectable dimension' is central to the claim; it should be given a numbered definition and illustrated with a small example graph in Section 3 or 4 so that readers can verify when the quantity is indeed much smaller than n.
The experimental section would benefit from a plot or table showing the empirical growth of the detectable dimension across the tested graphs, to support the claim that it remains small in practice.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our work on the graph bandit problem for identifying the most influential node and for recommending minor revision. We appreciate the recognition that the active-query model and the problem-dependent regret bound scaling with the detectable dimension represent a strength when observations are informative.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper introduces BARE for graph bandits without prior graph knowledge, where queries return stochastic influence sets, and claims a regret bound scaling with the detectable dimension (a problem-dependent quantity smaller than node count). No equations, fitted parameters, or self-citations are visible in the provided abstract or summary that reduce the regret claim to a self-defined input by construction. The detectable dimension is presented as an external property of the observation model rather than redefined from the algorithm's output or regret itself. The derivation appears self-contained against the stated assumptions, with the regret scaling serving as a standard problem-dependent analysis rather than a tautological renaming or fitted prediction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review based solely on abstract; no explicit free parameters, axioms, or invented entities are stated.

pith-pipeline@v0.9.0 · 5452 in / 1112 out tokens · 22649 ms · 2026-05-09T20:30:26.241929+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

[1]

From bandits to experts: A tale of domination and independence

Alon, Noga, Cesa-Bianchi, Nicol \` o , Gentile, Claudio, and Mansour, Yishay. From bandits to experts: A tale of domination and independence . In Neural Information Processing Systems, 2013

work page 2013
[2]

Online learning with feedback graphs: Beyond bandits

Alon, Noga, Cesa-Bianchi, Nicol \` o , Dekel, Ofer, and Koren, Tomer. Online learning with feedback graphs: Beyond bandits . In Conference on Learning Theory, 2015

work page 2015
[3]

Minimax policies for adversarial and stochastic bandits

Audibert, Jean-Yves and Bubeck, S \' e bastien. Minimax policies for adversarial and stochastic bandits . In Conference on Learning Theory, 2009

work page 2009
[4]

Emergence of scaling in random networks

Barab \' a si, Albert-L \' a szl \' o and Albert, R \' e ka. Emergence of scaling in random networks . Science, 286: 0 11, 1999

work page 1999
[5]

Bandit algorithms for social network queries

Bnaya, Zahy, Puzis, Rami, Stern, Roni, and Felner, Ariel. Bandit algorithms for social network queries . In International Conference on Social Computing, 2013 a

work page 2013
[6]

Social network search as a volatile multi-armed bandit problem

Bnaya, Zahy, Puzis, Rami, Stern, Roni, and Felner, Ariel. Social network search as a volatile multi-armed bandit problem . Human Journal, 2 0 (2): 0 84--98, 2013 b

work page 2013
[7]

Stochastic bandits with side observations on networks

Buccapatnam, Swapna, Eryilmaz, Atilla, and Shroff, Ness B. Stochastic bandits with side observations on networks . In International Conference on Measurement and Modeling of Computer Systems, 2014

work page 2014
[8]

Leveraging side observations in stochastic bandits

Caron, St \' e phane, Kveton, Branislav, Lelarge, Marc, and Bhagat, Smriti. Leveraging side observations in stochastic bandits. In Uncertainty in Artificial Intelligence, 2012

work page 2012
[9]

A gang of bandits

Cesa-Bianchi, Nicol \` o , Gentile, Claudio, and Zappella, Giovanni. A gang of bandits . In Neural Information Processing Systems, 2013

work page 2013
[10]

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms

Chen, Wei, Wang, Yajun, and Yuan, Yang. Combinatorial multi-armed bandit and its extension to probabilistically triggered arms . Technical report, 2015

work page 2015
[11]

Unimodal bandits: Regret lower bounds and optimal algorithms

Combes, Richard and Prouti \` e re, Alexandre. Unimodal bandits: Regret lower bounds and optimal algorithms . In International Conference on Machine Learning, 2014

work page 2014
[12]

Networked bandits with disjoint linear payoffs

Fang, Meng and Tao, Dacheng. Networked bandits with disjoint linear payoffs . In Internattional Conference on Knowledge Discovery and Data Mining, 2014

work page 2014
[13]

Online clustering of bandits

Gentile, Claudio, Li, Shuai, and Zappella, Giovanni. Online clustering of bandits . In International Conference on Machine Learning, 2014

work page 2014
[14]

Community structure in social and biological networks

Girvan, Michelle and Newman, Mark E J. Community structure in social and biological networks. National Academy of Sciences of the United States of America, 99 0 (12): 0 7821--6, 2002

work page 2002
[15]

Online spectral learning on a graph with bandit feedback

Gu, Quanquan and Han, Jiawei. Online spectral learning on a graph with bandit feedback . In International Conference on Data Mining, 2014

work page 2014
[16]

Maximizing the spread of influence through a social network

Kempe, David, Kleinberg, Jon, and Tardos, \' E va. Maximizing the spread of influence through a social network . Knowledge Discovery and Data mining, pp.\ 137, 2003

work page 2003
[17]

Maximizing the spread of influence through a social network

Kempe, David, Kleinberg, Jon, and Tardos, \' E va. Maximizing the spread of influence through a social network . Theory of Computing, 11 0 (4): 0 105--147, 2015

work page 2015
[18]

Introducing the Enron corpus

Klimt, Bryan and Yang, Yiming. Introducing the Enron corpus . In Collaboration, Electronic messaging, Anti-Abuse and Spam Conference, 2004

work page 2004
[19]

Efficient learning by implicit exploration in bandit problems with side observations

Koc \' a k, Tom \' a s , Neu, Gergely, Valko, Michal, and Munos, R \' e mi. Efficient learning by implicit exploration in bandit problems with side observations . In Neural Information Processing Systems, 2014 a

work page 2014
[20]

Spectral Thompson sampling

Koc \' a k, Tom \' a s , Valko, Michal, Munos, R \' e mi, and Agrawal, Shipra. Spectral Thompson sampling . In AAAI Conference on Artificial Intelligence, 2014 b

work page 2014
[21]

Online learning with noisy side observations

Koc \' a k, Tom \' a s , Neu, Gergely, and Valko, Michal. Online learning with noisy side observations . In International Conference on Artificial Intelligence and Statistics, 2016

work page 2016
[22]

Asymptotically efficient adaptive allocation rules

Lai, Tze L and Robbins, Herbert. Asymptotically efficient adaptive allocation rules . Advances in Applied Mathematics, 6 0 (1): 0 4--22, 1985

work page 1985
[23]

Online influence maximization

Lei, Siyu, Maniu, Silviu, Mo, Luyi, Cheng, Reynold, and Senellart, Pierre. Online influence maximization . In Knowledge Discovery and Data mining, 2015

work page 2015
[24]

SNAP datasets: Stanford large network dataset collection

Leskovec, Jure and Krevl, Andrej. SNAP datasets: Stanford large network dataset collection . http://snap.stanford.edu/data, 2014

work page 2014
[25]

From bandits to experts: On the value of side-observations

Mannor, Shie and Shamir, Ohad. From bandits to experts: On the value of side-observations . In Neural Information Processing Systems, 2011

work page 2011
[26]

Mapping the Gnutella network

Ripeanu, Matei, Iamnitchi, Adriana, and Foster, Ian. Mapping the Gnutella network . IEEE Internet Computing, 6 0 (1): 0 50--57, 2002

work page 2002
[27]

Information gathering in networks via active exploration

Singla, Adish, Horvitz, Eric, Kohli, Pushmeet, White, Ryen, and Krause, Andreas. Information gathering in networks via active exploration . In International Joint Conferences on Artificial Intelligence, 2015

work page 2015
[28]

Spectral bandits for smooth graph functions

Valko, Michal, Munos, R \' e mi, Kveton, Branislav, and Koc \' a k, Tom \' a s . Spectral bandits for smooth graph functions . In International Conference on Machine Learning, 2014

work page 2014
[29]

Vaswani, Sharan and Lakshmanan, Laks. V. S. Influence maximization with bandits . Technical report, http://arxiv.org/abs/1503.00024, 2015

work page arXiv 2015
[30]

On the evolution of user interaction in facebook

Viswanath, Bimal, Mislove, Alan, Cha, Meeyoung, and Gummadi, Krishna P. On the evolution of user interaction in facebook . In ACM Workshop on Online Social Networks, 2009

work page 2009
[31]

Online learning with Gaussian payoffs and side observations

Wu, Yifan, Gy \" o rgy, Andr \' a s, and Szepesv \' a ri, Csaba. Online learning with Gaussian payoffs and side observations . In Neural Information Processing Systems, 2015

work page 2015
[32]

Unimodal bandits

Yu, Jia Yuan and Mannor, Shie. Unimodal bandits . In International Conference on Machine Learning, 2011

work page 2011

[1] [1]

From bandits to experts: A tale of domination and independence

Alon, Noga, Cesa-Bianchi, Nicol \` o , Gentile, Claudio, and Mansour, Yishay. From bandits to experts: A tale of domination and independence . In Neural Information Processing Systems, 2013

work page 2013

[2] [2]

Online learning with feedback graphs: Beyond bandits

Alon, Noga, Cesa-Bianchi, Nicol \` o , Dekel, Ofer, and Koren, Tomer. Online learning with feedback graphs: Beyond bandits . In Conference on Learning Theory, 2015

work page 2015

[3] [3]

Minimax policies for adversarial and stochastic bandits

Audibert, Jean-Yves and Bubeck, S \' e bastien. Minimax policies for adversarial and stochastic bandits . In Conference on Learning Theory, 2009

work page 2009

[4] [4]

Emergence of scaling in random networks

Barab \' a si, Albert-L \' a szl \' o and Albert, R \' e ka. Emergence of scaling in random networks . Science, 286: 0 11, 1999

work page 1999

[5] [5]

Bandit algorithms for social network queries

Bnaya, Zahy, Puzis, Rami, Stern, Roni, and Felner, Ariel. Bandit algorithms for social network queries . In International Conference on Social Computing, 2013 a

work page 2013

[6] [6]

Social network search as a volatile multi-armed bandit problem

Bnaya, Zahy, Puzis, Rami, Stern, Roni, and Felner, Ariel. Social network search as a volatile multi-armed bandit problem . Human Journal, 2 0 (2): 0 84--98, 2013 b

work page 2013

[7] [7]

Stochastic bandits with side observations on networks

Buccapatnam, Swapna, Eryilmaz, Atilla, and Shroff, Ness B. Stochastic bandits with side observations on networks . In International Conference on Measurement and Modeling of Computer Systems, 2014

work page 2014

[8] [8]

Leveraging side observations in stochastic bandits

Caron, St \' e phane, Kveton, Branislav, Lelarge, Marc, and Bhagat, Smriti. Leveraging side observations in stochastic bandits. In Uncertainty in Artificial Intelligence, 2012

work page 2012

[9] [9]

A gang of bandits

Cesa-Bianchi, Nicol \` o , Gentile, Claudio, and Zappella, Giovanni. A gang of bandits . In Neural Information Processing Systems, 2013

work page 2013

[10] [10]

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms

Chen, Wei, Wang, Yajun, and Yuan, Yang. Combinatorial multi-armed bandit and its extension to probabilistically triggered arms . Technical report, 2015

work page 2015

[11] [11]

Unimodal bandits: Regret lower bounds and optimal algorithms

Combes, Richard and Prouti \` e re, Alexandre. Unimodal bandits: Regret lower bounds and optimal algorithms . In International Conference on Machine Learning, 2014

work page 2014

[12] [12]

Networked bandits with disjoint linear payoffs

Fang, Meng and Tao, Dacheng. Networked bandits with disjoint linear payoffs . In Internattional Conference on Knowledge Discovery and Data Mining, 2014

work page 2014

[13] [13]

Online clustering of bandits

Gentile, Claudio, Li, Shuai, and Zappella, Giovanni. Online clustering of bandits . In International Conference on Machine Learning, 2014

work page 2014

[14] [14]

Community structure in social and biological networks

Girvan, Michelle and Newman, Mark E J. Community structure in social and biological networks. National Academy of Sciences of the United States of America, 99 0 (12): 0 7821--6, 2002

work page 2002

[15] [15]

Online spectral learning on a graph with bandit feedback

Gu, Quanquan and Han, Jiawei. Online spectral learning on a graph with bandit feedback . In International Conference on Data Mining, 2014

work page 2014

[16] [16]

Maximizing the spread of influence through a social network

Kempe, David, Kleinberg, Jon, and Tardos, \' E va. Maximizing the spread of influence through a social network . Knowledge Discovery and Data mining, pp.\ 137, 2003

work page 2003

[17] [17]

Maximizing the spread of influence through a social network

Kempe, David, Kleinberg, Jon, and Tardos, \' E va. Maximizing the spread of influence through a social network . Theory of Computing, 11 0 (4): 0 105--147, 2015

work page 2015

[18] [18]

Introducing the Enron corpus

Klimt, Bryan and Yang, Yiming. Introducing the Enron corpus . In Collaboration, Electronic messaging, Anti-Abuse and Spam Conference, 2004

work page 2004

[19] [19]

Efficient learning by implicit exploration in bandit problems with side observations

Koc \' a k, Tom \' a s , Neu, Gergely, Valko, Michal, and Munos, R \' e mi. Efficient learning by implicit exploration in bandit problems with side observations . In Neural Information Processing Systems, 2014 a

work page 2014

[20] [20]

Spectral Thompson sampling

Koc \' a k, Tom \' a s , Valko, Michal, Munos, R \' e mi, and Agrawal, Shipra. Spectral Thompson sampling . In AAAI Conference on Artificial Intelligence, 2014 b

work page 2014

[21] [21]

Online learning with noisy side observations

Koc \' a k, Tom \' a s , Neu, Gergely, and Valko, Michal. Online learning with noisy side observations . In International Conference on Artificial Intelligence and Statistics, 2016

work page 2016

[22] [22]

Asymptotically efficient adaptive allocation rules

Lai, Tze L and Robbins, Herbert. Asymptotically efficient adaptive allocation rules . Advances in Applied Mathematics, 6 0 (1): 0 4--22, 1985

work page 1985

[23] [23]

Online influence maximization

Lei, Siyu, Maniu, Silviu, Mo, Luyi, Cheng, Reynold, and Senellart, Pierre. Online influence maximization . In Knowledge Discovery and Data mining, 2015

work page 2015

[24] [24]

SNAP datasets: Stanford large network dataset collection

Leskovec, Jure and Krevl, Andrej. SNAP datasets: Stanford large network dataset collection . http://snap.stanford.edu/data, 2014

work page 2014

[25] [25]

From bandits to experts: On the value of side-observations

Mannor, Shie and Shamir, Ohad. From bandits to experts: On the value of side-observations . In Neural Information Processing Systems, 2011

work page 2011

[26] [26]

Mapping the Gnutella network

Ripeanu, Matei, Iamnitchi, Adriana, and Foster, Ian. Mapping the Gnutella network . IEEE Internet Computing, 6 0 (1): 0 50--57, 2002

work page 2002

[27] [27]

Information gathering in networks via active exploration

Singla, Adish, Horvitz, Eric, Kohli, Pushmeet, White, Ryen, and Krause, Andreas. Information gathering in networks via active exploration . In International Joint Conferences on Artificial Intelligence, 2015

work page 2015

[28] [28]

Spectral bandits for smooth graph functions

Valko, Michal, Munos, R \' e mi, Kveton, Branislav, and Koc \' a k, Tom \' a s . Spectral bandits for smooth graph functions . In International Conference on Machine Learning, 2014

work page 2014

[29] [29]

Vaswani, Sharan and Lakshmanan, Laks. V. S. Influence maximization with bandits . Technical report, http://arxiv.org/abs/1503.00024, 2015

work page arXiv 2015

[30] [30]

On the evolution of user interaction in facebook

Viswanath, Bimal, Mislove, Alan, Cha, Meeyoung, and Gummadi, Krishna P. On the evolution of user interaction in facebook . In ACM Workshop on Online Social Networks, 2009

work page 2009

[31] [31]

Online learning with Gaussian payoffs and side observations

Wu, Yifan, Gy \" o rgy, Andr \' a s, and Szepesv \' a ri, Csaba. Online learning with Gaussian payoffs and side observations . In Neural Information Processing Systems, 2015

work page 2015

[32] [32]

Unimodal bandits

Yu, Jia Yuan and Mannor, Shie. Unimodal bandits . In International Conference on Machine Learning, 2011

work page 2011