KISS: Keeping it Simple and Slotted when Learning to Communicate over Wireless

Kamil Szczech; Katarzyna Kosek-Szott; Krzysztof Rusek; Maksymilian Wojnar; Szymon Szott

arxiv: 2606.00266 · v1 · pith:YIFVZR6Znew · submitted 2026-05-29 · 💻 cs.NI · cs.LG


Pith reviewed 2026-06-28 19:35 UTC · model grok-4.3


keywords: machine learning · wireless networks · random access · medium access control · DDQN · slotted ALOHA · distributed learning · multi-agent reinforcement learning


The pith

Decentralized machine learning agents learn to achieve near-optimal efficiency and fairness in wireless random channel access.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether fully distributed machine learning agents can discover efficient and fair random-access strategies over a wireless channel under minimal assumptions. Agents use an off-policy Double Deep Q-Network with Bayesian inference, training online without pre-training, coordination, or explicit communication. Simulations show the resulting policies adapt to changing network sizes and loads, reach near-theoretical throughput, and preserve fairness. Further analysis reveals the learned behavior matches slotted ALOHA with a transmission probability that adjusts dynamically to observed conditions.

Core claim

Fully online, independent DDQN agents with Bayesian inference, operating over a slotted channel without any coordination, learn access strategies that approach theoretical efficiency limits while maintaining fairness; ablation studies show this behavior reduces to slotted ALOHA with a dynamically tuned transmission probability.
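For reference, the benchmark usually meant by the efficiency limit of slotted ALOHA can be written out. The derivation below is a standard textbook result, not taken from the paper: it assumes N saturated stations that each transmit independently with probability p in every slot over an ideal collision channel. The symbols N, p, and S are introduced here for illustration, and whether the paper uses exactly this benchmark is an assumption.

```latex
S(p) = N p (1-p)^{N-1}, \qquad
\frac{dS}{dp} = N (1-p)^{N-2} (1 - N p) = 0
\;\Rightarrow\; p^{*} = \frac{1}{N},
\qquad
S(p^{*}) = \left(1 - \frac{1}{N}\right)^{N-1}
\xrightarrow[N \to \infty]{} \frac{1}{e} \approx 0.368 .
```

Under these assumptions, a policy that tunes its transmission probability toward 1/N as the number of active stations changes stays at the maximum of S(p), which is one way to read "dynamically tuned transmission probability".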

What carries the argument

Off-policy Double Deep Q-Network with Bayesian inference that lets each agent estimate its own transmission probability from local observations alone.
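As a rough illustration of the Double DQN part, the sketch below computes the standard Double DQN target (the online network selects the next action, the target network evaluates it). It is a generic version of the technique, not the authors' code: the function name, the two-action labelling (0 = wait, 1 = transmit), and the discount value are assumptions, and the Bayesian treatment of the weights is not modelled here.

```python
def ddqn_targets(rewards, next_q_online, next_q_target, gamma=0.99):
    """Double DQN targets for a batch of transitions.

    rewards:        list of scalar rewards r
    next_q_online:  per-transition Q-values of the online net at the next state
    next_q_target:  per-transition Q-values of the target net at the next state
    No terminal flag is used, assuming a continuing (non-episodic) task.
    """
    targets = []
    for r, q_on, q_tg in zip(rewards, next_q_online, next_q_target):
        # Action selection uses the online network...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ...while action evaluation uses the target network.
        targets.append(r + gamma * q_tg[best_action])
    return targets


# Two illustrative transitions with two actions each (hypothetical values).
targets = ddqn_targets(
    rewards=[1.0, 0.0],
    next_q_online=[[0.2, 0.5], [0.9, 0.1]],
    next_q_target=[[0.3, 0.4], [0.6, 0.8]],
    gamma=0.5,
)
```

Decoupling selection from evaluation is what distinguishes Double DQN from plain DQN, where a single network does both and tends to overestimate action values.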

If this is right

No pre-training, central controller, or inter-agent messages are required for the method to operate.
The learned policy automatically adjusts its transmission probability as the number of active nodes changes.
Fairness and efficiency hold across a range of network loads in the simulated environment.
The final behavior is simple enough to be described as dynamic slotted ALOHA rather than an opaque neural policy.
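The "dynamic slotted ALOHA" reading can be compared against a toy model. The sketch below is not the paper's simulator: it assumes saturated stations, an ideal collision channel, and a transmission probability fixed at 1/N, and all function names are hypothetical.

```python
import random


def analytic_throughput(n, p):
    """Probability that exactly one of n stations transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)


def simulate_slotted_aloha(n, p, slots=50_000, seed=0):
    """Monte Carlo estimate of the fraction of slots with a single transmitter."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(slots):
        transmitters = sum(rng.random() < p for _ in range(n))
        if transmitters == 1:  # idle (0) and collision (>= 2) slots carry no data
            successes += 1
    return successes / slots


n = 10
p = 1 / n  # the throughput-maximising choice for n saturated stations
analytic = analytic_throughput(n, p)
simulated = simulate_slotted_aloha(n, p)
print(f"analytic={analytic:.4f} simulated={simulated:.4f}")
```

Re-running with a different `n` while keeping `p = 1 / n` shows the kind of adjustment a learned policy would need to make as stations join or leave.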

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the simulation model is extended with realistic timing jitter, the same training procedure might produce a different adjustment rule.
The resemblance to slotted ALOHA suggests the learning process rediscovers a known optimum rather than inventing an entirely new mechanism.
Deployment on real radios would first require verifying that local observations remain sufficient when collisions are not perfectly observable.

Load-bearing premise

The slotted-channel simulation used for training captures all essential dynamics of real wireless random access.

What would settle it

Running the learned policy on hardware or in a simulator that includes hidden terminals and capture effects, then measuring whether throughput or fairness drops below the simulated levels.
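To see why capture effects matter for such a test, consider a deliberately simple model in which a collision slot still delivers one packet with some probability. This is an illustrative assumption, not a model from the paper; `capture_prob` and the function name are hypothetical.

```python
def throughput_with_capture(n, p, capture_prob=0.0):
    """Per-slot success probability for n stations transmitting with probability p.

    A slot succeeds if exactly one station transmits, or if two or more
    transmit and one packet is captured with probability capture_prob.
    """
    idle = (1 - p) ** n
    single = n * p * (1 - p) ** (n - 1)
    collision = 1 - idle - single
    return single + capture_prob * collision


no_capture = throughput_with_capture(10, 0.10)
capture_at_tuned_p = throughput_with_capture(10, 0.10, capture_prob=0.5)
capture_at_higher_p = throughput_with_capture(10, 0.15, capture_prob=0.5)
print(no_capture, capture_at_tuned_p, capture_at_higher_p)
```

In this toy model, a transmission probability above 1/N yields more throughput once capture is possible, so a policy tuned on an ideal collision channel would no longer sit at the optimum. Hidden terminals would need a different model and are not covered by this sketch.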

Figures

Figures reproduced from arXiv: 2606.00266 by Kamil Szczech, Katarzyna Kosek-Szott, Krzysztof Rusek, Maksymilian Wojnar, Szymon Szott.

**Figure 1.** Simplified KISS operation diagram.
Surrounding paper text (truncated): …troller, no shared parameters, and no control messages. We formulate the problem as a partially observable stochastic game (POSG) and define the environment (i.e., slotted wireless channel) states, action space, and observations (Section 3). We next design a reward function to promote efficient use of the channel, rewarding the agent for successful transmissions, penali…

**Figure 2.** Joint distribution of weight means θi and the uncertainty measured by the standard deviations σi for each of the five agents from a single training run (N = 5, saturated traffic). Axes: weight θi (x), uncertainty σi (y). Legend: dense layers, attention, output layer, biases, layer norm, y = |x|.

**Figure 5.** Performance for the steady-state case.
Surrounding paper text (Section 5, Performance Analysis; truncated): We evaluate KISS against the defined baselines in saturated networks, considering both instantaneous behavior for a 15-station system and steady-state scalability across different network sizes. We then analyze performance under non-saturated low- and medium-load traffic profiles, followed by a dynamic scenario where stations join or leave the n…

**Figure 6.** Comparison of the transmission probability of KISS agents versus ideal slotted ALOHA (…

**Figure 7.** Steady-state case under low traffic. Axes: number of stations (x); aggregate throughput and Jain's fairness index (y). Legend: EB-ALOHA, Ideal ALOHA-Q, Fixed ALOHA-Q, DLMA, KISS.

**Figure 9.** ALOHA-Q slot switching rate and ratio of failed transmissions vs. number of stations.

**Figure 10.** Performance for the dynamic case.
Surrounding paper text (truncated): The aggregate throughput of ALOHA-Q is heavily dependent on the number of active stations. When the number of stations decreases (increases), the throughput increases (drops). As in the steady-state case, ALOHA-Q performs best when the number of stations is close to the number of available time slots. As a result, ALOHA-Q requires continuous monitoring of the number of us…

**Figure 11.** Ablation of the observation history length.

**Figure 13.** Ablation of the delay penalties (Sidle, D). Removing them allows for channel capture, peaking throughput at ≈ 0.85 but destroying fairness. Axes: number of stations (x); aggregate throughput and Jain's fairness index (y). Legend: metrics (throughput, fairness); ablation (empty buffer reward, no empty buffer reward).

**Figure 15.** Ablation of listen-before-talk. LBT signif…

Original abstract

A long-standing challenge in distributed wireless systems is ensuring efficient and fair random channel access. Existing solutions often address specific constraints related to timing, periodicity, or centralization, but they typically rely on fixed heuristics. Motivated by recent advances in machine learning (ML), we investigate whether ML agents can autonomously learn efficient and fair access strategies, and whether such learning can offer new insights into medium access control (MAC) design. Rather than proposing a deployable protocol, our aim is to examine whether decentralized learning can rediscover or approximate theoretically efficient random-access mechanisms under minimal assumptions. To this end, we deploy an off-policy Double Deep Q-Network (DDQN) with Bayesian inference to train agents operating over a slotted channel. The resulting method is fully online (no pre-training), fully distributed (independent multi-agent learners), stochastic (non-periodic), and requires no coordination or explicit communication. Extensive simulations show that the learned strategy adapts to varying network conditions and achieves near-theoretical efficiency while maintaining fairness. Ablation studies further reveal that the learned behavior resembles slotted ALOHA with a dynamically adjusted transmission probability, leading us to refer to the method as KISS: Keeping It Simple and Slotted.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Off-the-shelf RL rediscovers slotted ALOHA in distributed wireless simulations.


The paper's main finding is that independent DDQN agents, trained on a slotted wireless channel with no coordination, converge on an access policy that matches slotted ALOHA with a transmission probability that adapts to network conditions.

The authors do a good job of framing the work narrowly. They are explicit that the goal is to test whether learning can approximate known efficient mechanisms rather than to build something new for deployment. That keeps the paper focused. The use of an off-policy DDQN with Bayesian inference for the multi-agent setting is a reasonable choice, and the fact that everything runs fully online and distributed is a plus for the claim of minimal assumptions. Spotting the ALOHA resemblance after training and backing it with ablations is the part that adds a bit of value beyond just reporting that the agents perform well.

On the performance side, the simulations are said to show adaptation to varying network conditions with efficiency close to theoretical limits and maintained fairness. This aligns with what one would hope for from a method that rediscovers a solid baseline.
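The figures on this page report fairness as Jain's fairness index. A minimal sketch of that metric follows, using the standard definition J = (Σx)² / (n · Σx²). The function name and the per-station throughput values are illustrative assumptions, not numbers from the paper.

```python
def jain_index(throughputs):
    """Jain's fairness index: 1.0 for equal shares, 1/n when one station takes all."""
    n = len(throughputs)
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero throughput")
    return total * total / (n * squares)


equal_shares = jain_index([0.07] * 5)                   # every station gets the same share
captured = jain_index([0.85, 0.0, 0.0, 0.0, 0.0])       # one station captures the channel
print(equal_shares, captured)
```

The second case mirrors the channel-capture situation: aggregate throughput can be high while the index falls to 1/n, which is why throughput and fairness are reported side by side.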

The limitations are standard for this type of work. All results come from simulation in an idealized slotted model that assumes perfect synchronization and no hidden terminals or capture effects. The abstract gives summary-level outcomes without specific numbers or variance details, so the strength of the efficiency and fairness claims is hard to assess precisely from the high-level description. The Bayesian component is mentioned but its contribution is not broken out in the provided summary.

This kind of paper is aimed at researchers in machine learning for communications or in wireless protocol design who are curious about what RL produces in classic settings. It is not revolutionary, but it is a solid incremental check on whether learning can recover known good strategies. The thinking is clear and the positioning avoids overreach, so it should go through peer review.

Referee Report

0 major / 3 minor

Summary. The manuscript investigates whether decentralized machine learning agents can learn efficient and fair random channel access strategies in a minimal slotted wireless channel model. Using an off-policy Double Deep Q-Network (DDQN) augmented with Bayesian inference, the agents operate fully online and distributed without pre-training, coordination, or explicit communication. Simulations indicate that the learned policy adapts to varying network conditions, achieves near-theoretical efficiency while preserving fairness, and observationally resembles slotted ALOHA with a dynamically adjusted transmission probability; the method is positioned as an exploratory tool rather than a deployable protocol.

Significance. If the simulation outcomes hold under detailed scrutiny, the work offers a concrete demonstration that decentralized learning can rediscover or approximate known efficient mechanisms (slotted ALOHA) from minimal assumptions, providing methodological insight into MAC design. Strengths include the fully online/distributed/stochastic formulation and the ablation study linking learned behavior to a classical protocol; these elements are explicitly credited as advancing understanding of what learning can achieve without fixed heuristics.

minor comments (3)

The abstract and results sections present simulation outcomes only at summary level (e.g., 'near-theoretical efficiency' and 'maintaining fairness') without reporting exact efficiency numbers, baselines, fairness metrics, error bars, or ablation details on the Bayesian component; adding these would allow verification of the central empirical claim.
The slotted-channel simulation model is described as minimal; a brief discussion of how unmodeled effects (hidden terminals, capture, timing jitter) were considered or excluded would strengthen the scope statement without altering the paper's positioning.
Notation for the DDQN update rule and Bayesian inference integration should be clarified with explicit equations or pseudocode to support reproducibility of the 'fully online' training procedure.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our manuscript and the recommendation for minor revision. The referee's description accurately reflects the scope, methodology, and positioning of the work as an exploratory tool rather than a deployable protocol. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity


The paper's central result is produced by training DDQN agents in a slotted-channel simulation and observing post-training behavior via ablation studies. No load-bearing derivation, equation, or fitted parameter is presented that reduces to its own inputs by construction. The resemblance to slotted ALOHA is reported as an empirical finding after training rather than an assumed or fitted input. No self-citation chain, uniqueness theorem, or ansatz smuggling is invoked to support the main claim. The work is scoped to simulation outcomes under minimal assumptions and remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard assumptions of RL convergence in multi-agent settings and on the fidelity of an idealized slotted collision channel; no new entities are postulated.

axioms (1)

Domain assumption: The wireless medium is perfectly slotted, with collisions as the only failure mode and no capture, fading, or timing errors.
This modeling choice is required for the agents to learn a pure transmission-probability policy without additional state variables.



Reference graph

Works this paper leans on

25 extracted references · 11 canonical work pages

[1] Norman Abramson. The aloha system: Another alternative for computer communications. In Proceedings of the November 17-19, 1970, fall joint computer conference, pages 281–285, 1970.
[2] Rasool Sadeghi, João Paulo Barraca, and Rui L Aguiar. A survey on cooperative mac protocols in ieee 802.11 wireless networks. Wireless Personal Communications, 95(2):1469–1493, 2017.
[3] Eren Balevi, Faeik T. Al Rabee, and Richard D. Gitlin. Aloha-noma for massive machine-to-machine iot communication. volume 2018-May, 2018. doi:10.1109/ICC.2018.8422892. URL https://www.scopus.com/inward/record.uri?eid=2-s2.0-85048454170&doi=10.1109%2fICC.2018.8422892&partnerID=40&md5=673539387e3b4404fa2ae5256111da43
[4] Mohamed Elkourdi, Asim Mazin, Eren Balevi, and Richard D. Gitlin. Enabling slotted aloha-noma for massive m2m communication in iot networks. pages 1–4, 2018. doi:10.1109/WAMICON.2018.8363906. URL https://www.scopus.com/inward/record.uri?eid=2-s2.0-85048394208&doi=10.1109%2fWAMICON.2018.8363906&partnerID=40&md5=e7d11870b151ff172b0cf37cf2b0af81
[5] Jun-Bae Seo, Yangqian Hu, Hu Jin, and Swades De. Aloha with sic-aided collision resolution. IEEE Internet of Things Journal, 12(8):10194–10209, 2025. doi:10.1109/JIOT.2024.3510461. URL https://www.scopus.com/inward/record.uri?eid=2-s2.0-105002562422&doi=10.1109%2fJIOT.2024.3510461&partnerID=40&md5=e32c033c4caa1a51f4373c5d7e541967
[6] Vignon Fidele Adanvo, Samuel Mafra, Samuel Montejo-Sanchez, Felipe Augusto Tondo, and Richard Demo Souza. Analytical modeling of slotted aloha-based direct-to-satellite-iot sensor networks over nakagami-m fading channels. IEEE Sensors Journal, 26(2):3264–3277, 2026. doi:10.1109/JSEN.2025.3635197. URL https://www.scopus.com/inward/record.uri?eid=2-s2.0-1...
[7] Felipe Augusto Tondo, Samuel Montejo Sanchez, and Richard Demo Souza. Closeness centrality-based scheduling for iot transmissions in leo satellite networks. pages 335–338, 2025. doi:10.1109/LCIoT64881.2025.11118577. URL https://www.scopus.com/inward/record.uri?eid=2-s2.0-105016379310&doi=10.1109%2fLCIoT64881.2025.11118577&partnerID=40&md5=f06798925d1d1...
[8] N. Abramson. The throughput of packet broadcasting channels. IEEE Transactions on Communications, 25(1):117–128, 1977. doi:10.1109/TCOM.1977.1093713
[9] Andres Garcia-Saavedra, Albert Banchs, Pablo Serrano, and Joerg Widmer. Adaptive mechanism for distributed opportunistic scheduling. IEEE Transactions on Wireless Communications, 14(6):3494–3508, 2015.
[10] Hado van Hasselt, Arthur Guez, and David Silver. Deep reinforcement learning with double q-learning. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI'16, pages 2094–2100. AAAI Press, 2016.
[11] Yi Chu, Paul D. Mitchell, and David Grace. Aloha and q-learning based medium access control for wireless sensor networks. In 2012 International Symposium on Wireless Communication Systems (ISWCS), pages 511–515, 2012. doi:10.1109/ISWCS.2012.6328420
[12] Yiding Yu, Taotao Wang, and Soung Chang Liew. Deep-reinforcement learning multiple access for heterogeneous wireless networks. IEEE Journal on Selected Areas in Communications, 37(6):1277–1290, 2019.
[13] Sung Hyun Park, Paul D. Mitchell, and David Grace. Reinforcement learning based mac protocol (aloha-q). IEEE Access, 2019. doi:10.1109/ACCESS.2019.2953801
[14] Slavica Tomovic and Igor Radusinovic. Dr-aloha-q: A q-learning-based adaptive mac protocol for underwater acoustic sensor networks. Sensors, 23(9):4474, 2023.
[15] Molly Zhang, Luca de Alfaro, and J. J. Garcia-Luna-Aceves. Making slotted aloha efficient and fair using reinforcement learning. Computer Communications, 181:58–68, 2022.
[16] Hrishikesh Dutta and Subir Biswas. Towards multi-agent reinforcement learning for wireless network protocol synthesis. In Proc. IEEE COMSNETS, 2021.
[17] Hrishikesh Dutta and Subir Biswas. Distributed reinforcement learning for scalable wireless medium access in iots and sensor networks. Computer Networks, 202:108662, 2022.
[18] Ziyang Guo, Zhenyu Chen, Peng Liu, Jianjun Luo, Xun Yang, and Xinghua Sun. Multi-agent reinforcement learning-based distributed channel access for next generation wireless networks. IEEE Journal on Selected Areas in Communications, 40(5):1587–1599, 2022.
[19] Zhenyu Chen and Xinghua Sun. Scalable multi-agent reinforcement learning-based distributed channel access. In ICC 2023-IEEE International Conference on Communications, pages 453–458. IEEE, 2023.
[20] Jianbin Xiao, Zhenyu Chen, Xinghua Sun, Wen Zhan, Xijun Wang, and Xiang Chen. Online multi-agent reinforcement learning for multiple access in wireless networks. IEEE Communications Letters, 27(12):3250–3254, 2023.
[21] Zhenyu Chen, Xinghua Sun, Yili Jin, and Fangxin Wang. Multi-task reinforcement learning-based multiple access for dynamic wireless networks. IEEE Transactions on Mobile Computing, 24(9):9153–9167, 2025. doi:10.1109/TMC.2025.3559676
[22] Mingqi Han, Xinghua Sun, Xijun Wang, and Xiang Chen. Foundation model enhanced multiple access in heterogeneous networks. IEEE Transactions on Mobile Computing, 24(9):8974–8987, 2025. doi:10.1109/TMC.2025.3558942
[23] Yi Chu, Selahattin Kosunalp, Paul D Mitchell, David Grace, and Tim Clarke. Application of reinforcement learning to medium access control for wireless sensor networks. Engineering Applications of Artificial Intelligence, 46:23–32, 2015.
[24] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
[25] Laurent Valentin Jospin, Hamid Laga, Farid Boussaid, Wray Buntine, and Mohammed Bennamoun. Hands-on bayesian neural networks—a tutorial for deep learning users. IEEE Computational Intelligence Magazine, 17(2):29–48, 2022. doi:10.1109/MCI.2022.3155327
