pith. sign in

arxiv: 2606.18392 · v1 · pith:VVY6U7T2new · submitted 2026-06-16 · 💻 cs.GT

When Mobile Crowdsourcing Meets Queueing Systems: Human-in-the-Loop Learning

Pith reviewed 2026-06-26 21:40 UTC · model grok-4.3

classification 💻 cs.GT
keywords human-in-the-loop learningqueueing systemsmobile crowdsourcingprice of anarchyside paymentsexploration-exploitationcongestion managementinformation acquisition
0
0 comments X

The pith

Dynamic side payments coordinate crowdsourced queue choices to bound price of anarchy below 2 with exact budget balance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies human-in-the-loop learning in queueing systems where customers both use and generate congestion reports via crowdsourcing. Myopic selfish choices by customers can drive the price of anarchy to infinity because they avoid exploring uncertain servers even when their data would help others. Existing non-monetary mechanisms for balancing exploration and exploitation fail here since customer decisions reshape the queues themselves. The authors introduce a dynamic side-payment mechanism that periodically charges some users and pays others to discourage excessive exploration, coordinate information gathering across heterogeneous servers, and guarantee a price of anarchy below 2 while preserving ex-post budget balance. Experiments on real datasets show the mechanism also performs well on average.

Core claim

Myopic decentralized server choices in endogenous queueing systems can produce infinite price of anarchy; a dynamic side-payment mechanism that charges and rewards customers periodically coordinates congestion management and information acquisition across heterogeneous servers, guarantees PoA below 2, and maintains ex-post budget balance.

What carries the argument

dynamic side-payment mechanism that periodically charges some customers and rewards others

If this is right

  • Myopic choices induce infinite price of anarchy by over-exploring likely congested servers.
  • In the single-server case the lower bound on price of anarchy falls as buffer size increases.
  • In the multi-server case the upper bound on price of anarchy falls as the number of servers grows.
  • The mechanism coordinates congestion management and information acquisition while keeping ex-post budget balance.
  • Real-dataset experiments show strong average-case performance in addition to the worst-case PoA bound.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same side-payment structure might stabilize other crowdsourced systems in which individual reports affect shared state, such as traffic apps or shared-bike availability.
  • Extending the mechanism to allow limited forward-looking customer behavior could further reduce the required payment magnitudes.
  • Testing the payments in a live mobile app would reveal whether users actually respond to the small, repeated transfers as predicted.

Load-bearing premise

Customers choose servers to maximize only their immediate service utility and ignore the value their observations create for future customers.

What would settle it

A deployment or simulation in which the side-payment mechanism is used yet the realized price of anarchy exceeds 2 or the budget fails to balance ex post.

Figures

Figures reproduced from arXiv: 2606.18392 by Hongbo Li, Lingjie Duan, Ness B. Shroff.

Figure 1
Figure 1. Figure 1: Customers sequentially arrive to request service between a variable [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Data samples and the corresponding fitted normal distributions of [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Average inefficiency ratios under the myopic policy, information [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Average inefficiency ratios plotted against buffer size [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
read the original abstract

In service systems, customers now rely on congestion information before deciding which queue or server to join, from restaurants and theme-park attractions to road networks. We study this setting as human-in-the-loop learning (HILL), where customers both consume and generate time-sensitive congestion information through crowdsourcing platforms. Because congestion reports become stale, efficient system operation requires continued exploration of servers whose current states are uncertain. Yet selfish customers avoid such exploration when it reduces their immediate service utility, even though their observations would benefit future customers. We analyze this tension between individual incentives and system-wide learning in queueing systems with endogenous congestion. We first show that myopic server choices can induce an infinite price of anarchy (PoA): decentralized customers may cause arbitrarily large efficiency losses by overexploring servers that are likely congested. In the single-server case, we prove that the lower bound on PoA decreases as buffer size grows, while in the multi-server case the upper bound decreases as the number of servers increases. We further show that existing informational, non-monetary mechanisms for exploration-exploitation with exogenous information fail in our setting, as customers' choices directly reshape the queue states and still lead to infinite PoA. To address this challenge, we design a dynamic side-payment mechanism that periodically charges some customers and rewards others, discouraging excessive exploration while maintaining ex-post budget balance. The mechanism coordinates congestion management and information acquisition across heterogeneous servers, and guarantees PoA below 2. Beyond worst-case analysis, experiments using real datasets demonstrate that the proposed mechanism also achieves strong average-case performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper studies human-in-the-loop learning in queueing systems where myopic, selfish customers both consume and generate congestion information via crowdsourcing. It shows that myopic server choices yield infinite price of anarchy (PoA), that prior non-monetary mechanisms for exploration-exploitation also yield infinite PoA in this endogenous setting, and that a proposed dynamic side-payment mechanism restores a PoA bound below 2 while preserving ex-post budget balance. The analysis further establishes that the lower bound on PoA decreases with buffer size in the single-server case and the upper bound decreases with the number of servers in the multi-server case. Real-dataset experiments are reported to show strong average-case performance.

Significance. If the PoA < 2 guarantee and ex-post budget balance hold under the stated model, the result would be significant for mechanism design at the intersection of game theory and queueing theory. It directly addresses the exploration-exploitation tension induced by myopic incentives in systems with endogenous congestion, provides explicit dependence of PoA bounds on system parameters (buffer size, number of servers), and includes empirical validation on real data. The combination of a constant-factor PoA bound with budget balance is a notable theoretical contribution in this setting.

minor comments (2)
  1. Abstract: the description of the real-dataset experiments does not name the datasets, the performance metrics, or the baselines used; adding these details would improve clarity without affecting the central claims.
  2. Abstract: the model of customer utilities, the precise definition of the dynamic side-payment rule, and the statement of the PoA theorem are not shown; while these belong in the body, a brief parenthetical reference to the relevant theorem number in the abstract would help readers locate the formal result.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work on human-in-the-loop learning in queueing systems and for recommending minor revision. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper's derivation proceeds via standard theoretical steps: establishing infinite PoA under myopic selfish choices, deriving parameter-dependent bounds on PoA for single- and multi-server cases, showing failure of prior non-monetary mechanisms, and constructing a dynamic side-payment mechanism whose PoA<2 and ex-post budget-balance properties are stated as theorem-level results. No step reduces a claimed prediction or bound to a fitted parameter, self-citation, or definitional equivalence; the argument chain is self-contained against external queueing and mechanism-design benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard assumptions about rational myopic customers and queueing dynamics; no free parameters, invented entities, or ad-hoc axioms are explicitly introduced in the abstract.

axioms (2)
  • domain assumption Customers are myopic and selfish, maximizing only immediate service utility.
    Stated as the source of the exploration-exploitation tension in the abstract.
  • domain assumption Congestion reports become stale over time.
    Invoked to justify continued exploration need.

pith-pipeline@v0.9.1-grok · 5817 in / 1312 out tokens · 18468 ms · 2026-06-26T21:40:16.230173+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

47 extracted references

  1. [1]

    Asymptotically optimal downlink scheduling over markovian fading channels,

    W. Ouyang, A. Eryilmaz, and N. B. Shroff, “Asymptotically optimal downlink scheduling over markovian fading channels,” in2012 Proceed- ings IEEE INFOCOM. IEEE, 2012, pp. 1224–1232

  2. [2]

    On queue-length information when customers travel to a queue,

    R. Hassin and R. Roet-Green, “On queue-length information when customers travel to a queue,”Manufacturing & Service Operations Management, 2020

  3. [3]

    Efficient inaccuracy: User-generated information sharing in a queue,

    J. Wang and M. Hu, “Efficient inaccuracy: User-generated information sharing in a queue,”Management Science, vol. 66, no. 10, pp. 4648– 4666, 2020

  4. [4]

    Food delivery service and restaurant: Friend or foe?

    M. Chen, M. Hu, and J. Wang, “Food delivery service and restaurant: Friend or foe?”Management Science, 2022

  5. [5]

    There is now an app to help you skip restaurant queues,

    D. Phelan, “There is now an app to help you skip restaurant queues,” https://www.timeout.com/london/blog/theres-now-an-app-to-h elp-you-skip-restaurant-queues-012717, 2017

  6. [6]

    Wait times for Disneyland app review: View user-submitted waiting times for Disneyland 2021,

    AppPicker, “Wait times for Disneyland app review: View user-submitted waiting times for Disneyland 2021,” https://www.apppicker.com/review s/10509, 2021

  7. [7]

    Transforming the holiday park experience with smart park- ing,

    Pavemint, “Transforming the holiday park experience with smart park- ing,” https://www.pavemint.com/blog/holiday-park-smart-parking, 2022

  8. [8]

    Distributed trip selection game for Public Bike System with crowdsourcing,

    J. Zhang, P. Lu, Z. Li, and J. Gan, “Distributed trip selection game for Public Bike System with crowdsourcing,” inIEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 2717–2725

  9. [9]

    Waze: App that combines Crowdsourcing with GPS naviga- tion,

    N. Epson, “Waze: App that combines Crowdsourcing with GPS naviga- tion,” https://www.bluelabellabs.com/blog/waze-an-app-that-combines-c rowdsourcing-with-gps-navigation, 2021

  10. [10]

    MyTSA Mobile Application,

    MyTSA, “MyTSA Mobile Application,” https://www.dhs.gov/publicat ion/dhstsapia-028-mytsa-mobile-application, 2017

  11. [11]

    Introduction to multi-armed bandits,

    A. Slivkins, “Introduction to multi-armed bandits,”Foundations and Trends® in Machine Learning, vol. 12, no. 1-2, pp. 1–286, 2019

  12. [12]

    Nonstationary bandit learning via pre- dictive sampling,

    Y . Liu, B. Van Roy, and K. Xu, “Nonstationary bandit learning via pre- dictive sampling,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2023, pp. 6215–6244

  13. [13]

    Restless-ucb, an efficient and low- complexity algorithm for online restless bandits,

    S. Wang, L. Huang, and J. Lui, “Restless-ucb, an efficient and low- complexity algorithm for online restless bandits,”Advances in Neural Information Processing Systems, vol. 33, pp. 11 878–11 889, 2020

  14. [14]

    Learning un- known service rates in queues: A multiarmed bandit approach,

    S. Krishnasamy, R. Sen, R. Johari, and S. Shakkottai, “Learning un- known service rates in queues: A multiarmed bandit approach,”Opera- tions Research, vol. 69, no. 1, pp. 315–330, 2021

  15. [15]

    When congestion games meet mobile crowd- sourcing: Selective information disclosure,

    H. Li and L. Duan, “When congestion games meet mobile crowd- sourcing: Selective information disclosure,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 5, 2023, pp. 5739–5746

  16. [16]

    Multi-armed- bandit-based spectrum scheduling algorithms in wireless networks: A survey,

    F. Li, D. Yu, H. Yang, J. Yu, K. Holger, and X. Cheng, “Multi-armed- bandit-based spectrum scheduling algorithms in wireless networks: A survey,”IEEE Wireless Communications, vol. 27, no. 1, pp. 24-30, 2020

  17. [17]

    Competitive Multi-armed Bandit Games for Resource Sharing,

    H. Li, and L. Duan, “Competitive Multi-armed Bandit Games for Resource Sharing,”IEEE Transactions on Mobile Computing, to appear, 2025

  18. [18]

    Distributed learning for dynamic congestion games,

    H. Li, and L. Duan, “Distributed learning for dynamic congestion games,”IEEE International Symposium on Information Theory (ISIT), pp. 3654–3659, 2024

  19. [19]

    Optimal control of admission to a queueing system,

    S. Stidham, “Optimal control of admission to a queueing system,”IEEE Transactions on Automatic Control, vol. 30, no. 8, pp. 705–713, 1985

  20. [20]

    M/g/1 queue with event-dependent arrival rates,

    B. Legros, “M/g/1 queue with event-dependent arrival rates,”Queueing Systems, vol. 89, pp. 269–301, 2018

  21. [21]

    Recommending paths: Follow or not follow?

    Y . Li, C. Courcoubetis, and L. Duan, “Recommending paths: Follow or not follow?” inIEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 2019, pp. 928–936

  22. [22]

    Virtues of patience in strategic queuing systems

    J. Gaitonde, and E. Tardos, “Virtues of patience in strategic queuing systems” inProceedings of the 22nd ACM Conference on Economics and Computation. IEEE, 2021, pp. 520–540

  23. [23]

    Sharing delay information in service systems: a literature survey,

    R. Ibrahim, “Sharing delay information in service systems: a literature survey,” inQueueing Systems. Springer, vol. 89, no. 1, 2018, pp. 49–79

  24. [24]

    Dynamic information design: a simple problem on optimal sequential information disclosure,

    F. Farhadi and D. Teneketzis, “Dynamic information design: a simple problem on optimal sequential information disclosure,”Dynamic Games and Applications, vol. 12, no. 2, pp. 443–484, 2022

  25. [25]

    Bayesian exploration: Incentivizing exploration in bayesian games,

    Y . Mansour, A. Slivkins, V . Syrgkanis, and Z. S. Wu, “Bayesian exploration: Incentivizing exploration in bayesian games,”Operations Research, vol. 70, no. 2, pp. 1105–1127, 2022

  26. [26]

    Signaling service quality through queue disclosure,

    P. Guo, M. Haviv, Z. Luo, and Y . Wang, “Signaling service quality through queue disclosure,”Manufacturing & Service Operations Man- agement, 2022

  27. [27]

    B ¨orgers and D

    T. B ¨orgers and D. Krahmer,An introduction to the theory of mechanism design. Oxford University Press, USA, 2015

  28. [28]

    Dynamic routing for social in- formation sharing,

    Y . Li, C. A. Courcoubetis, and L. Duan, “Dynamic routing for social in- formation sharing,”IEEE Journal on Selected Areas in Communications, vol. 35, no. 3, pp. 571–585, 2017

  29. [29]

    The effectiveness of subsidies and tolls in congestion games,

    B. L. Ferguson, P. N. Brown, and J. R. Marden, “The effectiveness of subsidies and tolls in congestion games,”IEEE Transactions on Automatic Control, 2021

  30. [30]

    Spatio-temporal pricing for rideshar- ing platforms,

    H. Ma, F. Fang, and D. C. Parkes, “Spatio-temporal pricing for rideshar- ing platforms,”Operations Research, 2021

  31. [31]

    Informational incentives for congestion games,

    H. Tavafoghi and D. Teneketzis, “Informational incentives for congestion games,” in2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2017, pp. 1285–1292

  32. [32]

    A traffic congestion analysis by user equilibrium and system optimum with incomplete information,

    Q. Zhang, S. Q. Liu, and M. Masoud, “A traffic congestion analysis by user equilibrium and system optimum with incomplete information,” Journal of Combinatorial Optimization, pp. 1–14, 2020

  33. [33]

    Optimal control of a queueing system with two heterogeneous servers,

    W. Lin and P. Kumar, “Optimal control of a queueing system with two heterogeneous servers,”IEEE Transactions on Automatic control, vol. 29, no. 8, pp. 696–703, 1984

  34. [34]

    Discrete-time queuing theory,

    T. Meisling, “Discrete-time queuing theory,”Operations Research, vol. 6, no. 1, pp. 96–105, 1958

  35. [35]

    Hassin,Rational queueing

    R. Hassin,Rational queueing. CRC press, 2016

  36. [36]

    Minimizing the age of information through queues,

    A. Bedewy, Y . Sun and N. B. Shroff, “Minimizing the age of information through queues,”IEEE Transactions on Information Theory, vol. 65, no. 8, pp. 5215–5232, 2019

  37. [37]

    Online pricing incentive to sample fresh informa- tion,

    H. Li and L. Duan, “Online pricing incentive to sample fresh informa- tion,”IEEE Transactions on Network Science and Engineering, vol. 10, no. 1, pp. 514–526, 2022

  38. [38]

    Applying a new device in the optimization of exponen- tial queuing systems,

    S. A. Lippman, “Applying a new device in the optimization of exponen- tial queuing systems,”Operations Research, vol. 23, no. 4, pp. 687–710, 1975

  39. [39]

    Dynamic programming,

    R. Bellman, “Dynamic programming,”Science, vol. 153, no. 3731, pp. 34–37, 1966

  40. [40]

    Approximate Dynamic Programming: Solving the curses of dimensionality,

    W. B. Powell, “Approximate Dynamic Programming: Solving the curses of dimensionality,”John Wiley & Sons, vol. 703, 2007

  41. [41]

    Reinforcement learning: An introduction,

    R. Sutton, and A. G. Barto, “Reinforcement learning: An introduction,” MIT press Cambridge, vol. 1, no. 1, 1998. IEEE TRANSACTIONS ON NETWORKING 12

  42. [42]

    Worst-case equilibria,

    E. Koutsoupias and C. Papadimitriou, “Worst-case equilibria,” inAnnual symposium on theoretical aspects of computer science. Springer, 1999, pp. 404–413

  43. [43]

    Individually rational, budget-balanced mechanisms and allocation of surplus,

    G. Kosenok and S. Severinov, “Individually rational, budget-balanced mechanisms and allocation of surplus,”Journal of Economic Theory, vol. 140, no. 1, pp. 126–161, 2008

  44. [44]

    The impossibility of strategy-proof, pareto efficient, and individually rational rules for fractional matching,

    S. Alva and V . Manjunath, “The impossibility of strategy-proof, pareto efficient, and individually rational rules for fractional matching,”Games and Economic Behavior, vol. 119, pp. 15–29, 2020

  45. [45]

    Talabat Dubai detailed data for Feb 2023,

    Talabat, “Talabat Dubai detailed data for Feb 2023,” https://www.kagg le.com/datasets/zivaank/data-feb-2023?resource=download, 2023

  46. [46]

    To Analyze and Regulate Human-in-the-Loop Learning for Congestion Games,

    H. Li, and L. Duan, “To Analyze and Regulate Human-in-the-Loop Learning for Congestion Games,”IEEE Transactions on Networking, to appear, 2025. Hongbo Li(M’24) received the B.Sc. degree from Shanghai Jiao Tong University (SJTU), China, in 2019, and the Ph.D. degree from Singapore Univer- sity of Technology and Design (SUTD), Singapore, in 2024. He is curr...

  47. [47]

    He currently serves as the Institute Director of the NSF AI Institute for Future Edge Networks and Distributed Intelligence (AI-EDGE)

    In 2007, he joined The Ohio State University, Columbus, OH, USA, where he holds the Ohio Eminent Scholar Endowed Chair in networking and communications, with the Departments of ECE and CSE. He currently serves as the Institute Director of the NSF AI Institute for Future Edge Networks and Distributed Intelligence (AI-EDGE). He is also the director of the n...