
arxiv: 2606.08249 · v1 · pith:RWGLCZGPnew · submitted 2026-06-06 · 💻 cs.RO · cs.LG

Disturbance-Aware Aerial Robotics for Ethical Wildlife Monitoring

Pith reviewed 2026-06-27 19:34 UTC · model grok-4.3

classification 💻 cs.RO cs.LG
keywords aerial robotics · wildlife monitoring · reinforcement learning · disturbance minimization · ethical conservation · simulation-based training · animal behavior · conservation technology

The pith

In simulation, reinforcement learning policies for drones outperform rule-based baselines at tracking wildlife while minimizing disturbance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a reinforcement learning approach for drones to monitor wildlife ethically by avoiding behavioral changes. It creates a simulation using real animal movement data to train policies that trade off between collecting good observations and not disturbing the animals. The method is tested on pigeons, jackals, and spur-winged lapwings with different natural behaviors. The resulting policies beat existing rule-based methods and work across different tasks, animals, and drone types without needing real-world training data. A sympathetic reader would care because this offers a scalable way to study animals without the ethical problems of traditional methods.

Core claim

The paper claims that a disturbance-aware reinforcement-learning-based framework, coupling a zoologically grounded simulation with fitted animal movement models and a reward that balances observation quality and disturbance risk, produces policies that consistently surpass rule-based baselines across three species and four behavior models while generalizing across tasks, dynamics, and drone types.

What carries the argument

Disturbance-aware reinforcement learning framework trained in a simulation environment fitted to real animal trajectories, using a reward that captures the observation-disturbance trade-off.
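
To make the observation-disturbance trade-off concrete, here is a minimal sketch of what such a reward could look like. It is not the paper's reward: the function names, the Gaussian quality term, the exponential risk term, and every constant (`ideal_m`, `spread_m`, the two scale lengths, `w_obs`, `w_dist`) are hypothetical choices made for illustration only.

```python
import math


def observation_quality(distance_m, ideal_m=20.0, spread_m=10.0):
    """Hypothetical quality score in (0, 1], peaking at an assumed ideal viewing distance."""
    return math.exp(-((distance_m - ideal_m) ** 2) / (2.0 * spread_m ** 2))


def disturbance_risk(horizontal_m, altitude_m, radial_scale_m=15.0, altitude_scale_m=10.0):
    """Hypothetical risk in (0, 1]: a radial decay term multiplied by an altitude decay term."""
    radial = math.exp(-horizontal_m / radial_scale_m)
    vertical = math.exp(-altitude_m / altitude_scale_m)
    return radial * vertical


def reward(horizontal_m, altitude_m, w_obs=1.0, w_dist=2.0):
    """Weighted difference: observation quality minus disturbance risk (assumed weights)."""
    distance_m = math.hypot(horizontal_m, altitude_m)
    quality = observation_quality(distance_m)
    risk = disturbance_risk(horizontal_m, altitude_m)
    return w_obs * quality - w_dist * risk


if __name__ == "__main__":
    # Compare three drone positions relative to the animal (horizontal offset, altitude).
    for label, pos in [("hovering overhead", (0.0, 1.0)),
                       ("standoff", (12.0, 16.0)),
                       ("far away", (200.0, 50.0))]:
        print(label, round(reward(*pos), 3))
```

Under these assumed shapes, a standoff position scores higher than either hovering directly over the animal (high risk) or staying far away (no useful observation), which is the qualitative behavior a disturbance-aware reward is meant to encourage.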

If this is right

  • Learned policies can be deployed on heterogeneous aerial robotic fleets for autonomous tracking.
  • Policies generalize without retraining to new monitoring tasks, animal dynamics, and drone types.
  • This establishes disturbance-aware learning as a viable foundation for non-invasive autonomous wildlife observation.
  • Opens a path towards scalable, ethically responsible robotic monitoring in ecology and conservation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This framework could be adapted for monitoring in other sensitive environments where minimizing impact is key.
  • Real-world validation would be needed to confirm the simulation accurately predicts disturbance levels.
  • Such methods might influence policy on the use of drones in conservation by providing data on minimal-impact approaches.

Load-bearing premise

The zoologically grounded simulation environment accurately represents natural animal behaviors and the observation-disturbance trade-off using models fitted from real trajectory statistics.

What would settle it

A real-world experiment deploying the policies on drones near actual animals and finding that disturbance levels are not lower than with rule-based methods or that generalization fails for new species.

Figures

Figures reproduced from arXiv: 2606.08249 by Isac Paulsson, Mahmut Osmanovic, Teddy Lazebnik.

Figure 1: Example procedurally generated resource map. The heatmap shows spatially varying encounter proba…
Figure 2: Visual representation of static disturbance function. Top left: Combined radial and altitude gain func…
Figure 3: Pareto frontier between monitoring reward and disturbance penalty across all evaluated policy configu…
Figure 4: Comparison between SAC policies trained on fitted simulated movement priors and SAC policies trained…
Figure 5: High level environment function flow chart
Figure 6: SAC architecture. The diagram illustrates the interaction between the stochastic actor, twin Q-function…
Figure 7: DQN architecture, two-layer MLPs with LeakyReLU activations. Experience is stored in a replay buffer…
Figure 8: PPO architecture, two-layer MLPs with LeakyReLU activations.
Figure 9: Final obtained reward (using PPO) in each of the cases is approximately 0.06, 0.50 and 0.75. At 0.06…
Figure 10: Training dynamics across learning algorithms. Each row shows the evolution of total reward (…
Figure 11: Spatial distribution of drone positions relative to the animal (located at position…
Figure 12: Spatial distribution of drone positions relative to the animal (located at position…
Figure 13: Top-down spatial distribution of drone positions relative to the target for policies trained on synthetic…
Figure 14: Top-down spatial distribution of drone positions relative to the animal for the best movement-prior…
Original abstract

Reliable wildlife monitoring is essential for ecology and conservation, yet many existing methods, such as tagging, capture, and close-range observation, can alter the very behaviors they aim to measure. Aerial robots offer a scalable alternative, which has shown promising performance in multiple studies. Nonetheless, existing approaches typically lack behavioral awareness, rely on fixed heuristics, or require real-world training data that are costly, impractical, and ethically difficult to obtain. As a result, there remains no general framework for adaptive drone-based monitoring that can both preserve ecological validity and scale across species, behaviors, and robotic platforms. In this study, we introduce a disturbance-aware reinforcement-learning-based framework for heterogeneous aerial robotic fleets that enables autonomous wildlife tracking while explicitly minimizing behavioral disruption. We couple a zoologically grounded simulation environment with fitted animal movement models derived from real trajectory statistics, and train control policies using a reward formulation that captures the trade-off between observation quality and disturbance risk. Across three species (pigeon, jackal, and spur-winged lapwing) with distinct ecologies and motion patterns and four increasingly strategic behavior models common in nature, the learned policies consistently surpassed currently used rule-based baselines and generalized across monitoring tasks, animal dynamics, and drone types. These results establish disturbance-aware learning as a viable foundation for non-invasive autonomous wildlife observation, opening a path towards scalable, ethically responsible, and scientifically reliable robotic monitoring in ecology and conservation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces a disturbance-aware reinforcement learning framework for heterogeneous aerial robotic fleets to perform autonomous wildlife tracking while minimizing behavioral disruption. It couples a zoologically grounded simulation (with animal movement models fitted to real trajectory statistics) and a reward formulation trading off observation quality against disturbance risk. Across three species (pigeon, jackal, spur-winged lapwing) and four increasingly strategic behavior models, the learned policies are reported to outperform rule-based baselines and to generalize across monitoring tasks, animal dynamics, and drone types.

Significance. If the simulation accurately captures the observation-disturbance trade-off and the reported performance gains hold under rigorous validation, the work would provide a scalable, simulation-only training route for ethically responsible drone-based monitoring that avoids real-world training data collection. This could meaningfully advance non-invasive methods in ecology and conservation.

major comments (2)
  1. [Abstract] The central claim that 'the learned policies consistently surpassed currently used rule-based baselines and generalized across monitoring tasks, animal dynamics, and drone types' is presented without any description of experimental setup, number of trials, statistical tests, error bars, or cross-validation procedure, making it impossible to evaluate whether the results support the generalization claim.
  2. [Simulation environment and evaluation sections] Animal movement models are fitted to real trajectory statistics, yet no real-world measurements of drone-induced disturbance (flight initiation distance, evasion tactics, or habituation) are reported; all training and testing remain in simulation. This is load-bearing for the transferability and ethical-validity claims, because any mismatch between modeled and actual behavioral responses would invalidate the reward formulation and resulting policies.
minor comments (1)
  1. [Abstract] The abstract states results for 'four increasingly strategic behavior models common in nature' but does not name or briefly characterize those models; adding one sentence would improve readability.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback. We address each major comment below, indicating planned revisions where appropriate.

point-by-point responses
  1. Referee: [Abstract] The central claim that 'the learned policies consistently surpassed currently used rule-based baselines and generalized across monitoring tasks, animal dynamics, and drone types' is presented without any description of experimental setup, number of trials, statistical tests, error bars, or cross-validation procedure, making it impossible to evaluate whether the results support the generalization claim.

    Authors: We agree that the abstract would benefit from additional context. In the revised manuscript we will expand the abstract to briefly note the simulation-based experimental setup, the use of multiple trials across species and drone types, and that performance comparisons include statistical measures. Full details on trial counts, variance, and cross-validation procedures already appear in the Evaluation section; the abstract revision will improve accessibility while staying within its length constraints.
    revision: yes

  2. Referee: [Simulation environment and evaluation sections] Animal movement models are fitted to real trajectory statistics, yet no real-world measurements of drone-induced disturbance (flight initiation distance, evasion tactics, or habituation) are reported; all training and testing remain in simulation. This is load-bearing for the transferability and ethical-validity claims, because any mismatch between modeled and actual behavioral responses would invalidate the reward formulation and resulting policies.

    Authors: The work is deliberately scoped as a simulation-only framework precisely to avoid the ethical and practical difficulties of collecting real-world training data on disturbance, as stated in the introduction. Animal movement models are fitted to published trajectory statistics, and disturbance parameters are drawn from existing zoological literature. We will add an explicit limitations section that restricts all generalization and ethical claims to the simulated setting and discusses the modeling assumptions. We cannot supply new empirical disturbance measurements because none were collected.
    revision: partial

standing simulated objections not resolved
  • Absence of new real-world measurements of drone-induced disturbance (flight initiation distance, evasion tactics, or habituation)

Circularity Check

0 steps flagged

No circularity: simulation-trained RL policies evaluated against independent baselines

full rationale

The paper derives policies via reinforcement learning in a simulation whose animal dynamics are fitted once from external trajectory data; the reward explicitly trades off two distinct objectives, and performance claims rest on out-of-sample comparisons to rule-based baselines within that simulation. No step reduces a claimed prediction to a fitted parameter by definition, no self-citation chain carries the central result, and the derivation chain remains independent of its own outputs.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim depends on the fidelity of fitted animal models and the simulation capturing disturbance effects; these are domain assumptions pulled from real trajectory data but without independent verification details in the abstract.

free parameters (2)
  • reward formulation weights
    The reward captures the trade-off between observation quality and disturbance risk, implying tunable parameters fitted during training.
  • animal movement model parameters
    Models are fitted from real trajectory statistics, introducing parameters chosen to match observed data.
axioms (1)
  • domain assumption: fitted animal movement models from real trajectories sufficiently represent natural behaviors and disturbance responses across species
    Invoked to create the zoologically grounded simulation environment used for training.
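
As a rough illustration of what "movement model parameters fitted from trajectory statistics" can mean, the sketch below fits a correlated random walk to a track of (x, y) positions and then samples a synthetic path from the fitted parameters. The choice of a correlated random walk, the Gaussian step and turn distributions, and the names `step_statistics`, `fit_crw`, and `simulate_crw` are assumptions made for this example; the summary above does not say which model family or fitting procedure the paper uses.

```python
import math
import random
import statistics


def step_statistics(track):
    """Return step lengths and turning angles (radians) for a list of (x, y) points."""
    lengths, headings = [], []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        lengths.append(math.hypot(dx, dy))
        headings.append(math.atan2(dy, dx))
    # Wrap each heading change into [-pi, pi) so left and right turns are comparable.
    turns = [(b - a + math.pi) % (2.0 * math.pi) - math.pi
             for a, b in zip(headings, headings[1:])]
    return lengths, turns


def fit_crw(track):
    """Summarize a track (at least three points) by three assumed model parameters."""
    lengths, turns = step_statistics(track)
    return {
        "step_mean": statistics.fmean(lengths),
        "step_std": statistics.pstdev(lengths),
        "turn_std": statistics.pstdev(turns),
    }


def simulate_crw(params, n_steps, seed=0):
    """Sample a synthetic path: each heading is the previous heading plus Gaussian noise."""
    rng = random.Random(seed)
    x = y = heading = 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        heading += rng.gauss(0.0, params["turn_std"])
        step = max(0.0, rng.gauss(params["step_mean"], params["step_std"]))
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        path.append((x, y))
    return path


if __name__ == "__main__":
    observed = [(0.0, 0.0), (1.0, 0.1), (2.1, 0.0), (3.0, 0.4), (3.8, 1.1)]  # made-up track
    params = fit_crw(observed)
    print(params)
    print(simulate_crw(params, n_steps=3))
```

Every parameter fitted this way is a place where the simulated animal can diverge from the real one, which is why the ledger counts these parameters as free.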

pith-pipeline@v0.9.1-grok · 5782 in / 1360 out tokens · 15712 ms · 2026-06-27T19:34:29.590940+00:00 · methodology


Reference graph

Works this paper leans on

94 extracted references

  1. [1]

    Linking movement ecology with wildlife management and conser- vation.Frontiers in Ecology and Evolution, 3:155, 2016

    Andrew M Allen and Navinder J Singh. Linking movement ecology with wildlife management and conser- vation.Frontiers in Ecology and Evolution, 3:155, 2016

  2. [2]

    Novel opportunities for wildlife conservation and research with real-time monitoring.Ecological Applications, 24(4):593–601, 2014

    Jake Wall, George Wittemyer, Brian Klinkenberg, and Iain Douglas-Hamilton. Novel opportunities for wildlife conservation and research with real-time monitoring.Ecological Applications, 24(4):593–601, 2014

  3. [3]

    John Wiley & Sons, 2001

    Caryl L Elzinga, Daniel W Salzer, John W Willoughby, and James P Gibbs.Monitoring plant and animal populations: a handbook for field biologists. John Wiley & Sons, 2001

  4. [4]

    Justin Irvine, Philip A

    Sol `ene Marion, Althea Davies, Urˇska Demˇsar, R. Justin Irvine, Philip A. Stephens, and Jed Long. A system- atic review of methods for studying the impacts of outdoor recreation on terrestrial wildlife.Global Ecology and Conservation, 22:e00917, 2020

  5. [5]

    Lahoz-Monfort and Michael J

    Jos ´e J. Lahoz-Monfort and Michael J. L. Magrath. A comprehensive overview of technologies for species and habitat monitoring and conservation.BioScience, 71(10):1038–1062, 2021

  6. [6]

    Teddy Lazebnik, Yehuda Samuel, Jonathan Tichon, Roi Lapid, Roni King, Tomer Nissimyan, and Orr Spiegel. An empirically-parameterized spatio-temporal extended-sir model for combined dilution and vac- cination mitigation for rabies outbreaks in wild jackals.Ecological Modelling, 514:111487, 2026

  7. [7]

    Selection of timescales to study social network temporal dynamics in vultures.Animal Behaviour, 232:123442, 2026

    Kaija Gahm, Elvira D’Bastiani, Nili Anglister, Gideon Vaadia, Marta Ac ´acio, Orr Spiegel, and Noa Pinter- Wollman. Selection of timescales to study social network temporal dynamics in vultures.Animal Behaviour, 232:123442, 2026

  8. [8]

    Cole Burton, Eric Neilson, Dario Moreira, Andrew Ladle, Robin Steenweg, Jason T

    A. Cole Burton, Eric Neilson, Dario Moreira, Andrew Ladle, Robin Steenweg, Jason T. Fisher, Erin Bayne, and Stan Boutin. Wildlife camera trapping: a review and recommendations for linking surveys to ecological processes.Journal of Applied Ecology, 52(3):675–685, 2015

  9. [9]

    Kolowski and Tavis D

    Joseph M. Kolowski and Tavis D. Forrester. Camera trap placement and the potential for bias due to trails and other features.PLOS ONE, 12(10):e0186679, 2017

  10. [10]

    Cole Burton, Douglas A

    Anthony Caravaggi, A. Cole Burton, Douglas A. Clark, Jason T. Fisher, Amelia Grass, Sian Green, Catherine Hobaiter, Tim R. Hofmeester, Ammie K. Kalan, Daniella Rabaiotti, and Danielle Rivet. A review of factors to consider when using camera traps to study animal behavior to inform wildlife ecology and conservation. Conservation Science and Practice, 2(8):...

  11. [11]

    Gallagher, Robert Hering, Thomas M¨uller, Marlee Tucker, Marco Apollonio, Janosch Arnold, et al

    Jonas Stiegler, Cara A. Gallagher, Robert Hering, Thomas M¨uller, Marlee Tucker, Marco Apollonio, Janosch Arnold, et al. Mammals show faster recovery from capture and tagging in human-disturbed landscapes. Nature Communications, 15:8079, 2024

  12. [12]

    Monitoring animal behaviour and environmental interactions using wireless sensor networks, gps collars and satellite remote sensing.Sensors, 9(05):3586–3603, 2009

    Rebecca N Handcock, Dave L Swain, Greg J Bishop-Hurley, Kym P Patison, Tim Wark, Philip Valencia, Peter Corke, and Christopher J O’Neill. Monitoring animal behaviour and environmental interactions using wireless sensor networks, gps collars and satellite remote sensing.Sensors, 9(05):3586–3603, 2009

  13. [13]

    The effects of gps collars on african elephant (loxodonta africana) behavior at the san diego zoo safari park.Applied Animal Behaviour Science, 142(1-2):76–81, 2012

    Kristina Marie Horback, Lance Joseph Miller, Jeffrey Andrews, Stanley Abraham Kuczaj II, and Matthew Anderson. The effects of gps collars on african elephant (loxodonta africana) behavior at the san diego zoo safari park.Applied Animal Behaviour Science, 142(1-2):76–81, 2012. Draft: June 9, 2026 23

  14. [14]

    Drones and ai-driven solutions for wildlife monitoring.Drones, 9(7):455, 2025

    Nourdine Aliane. Drones and ai-driven solutions for wildlife monitoring.Drones, 9(7):455, 2025

  15. [15]

    Lucia Pedrazzi, Hemal Naik, Chris Sandbrook, Miguel Lurgi, Ines F ¨urtbauer, and Andrew J. King. Advanc- ing animal behaviour research using drone technology.Animal Behaviour, 222:123147, 2025

  16. [16]

    Robotic ecology: Tracking small dynamic animals with an autonomous aerial vehicle.Science robotics, 3(23):eaat8409, 2018

    Oliver M Cliff, Debra L Saunders, and Robert Fitch. Robotic ecology: Tracking small dynamic animals with an autonomous aerial vehicle.Science robotics, 3(23):eaat8409, 2018

  17. [17]

    Wildlife research and management methods in the 21st century: Where do unmanned aircraft fit in?Journal of Unmanned Vehicle Systems, 3(4):137–155, 2015

    Dominique Chabot and David M Bird. Wildlife research and management methods in the 21st century: Where do unmanned aircraft fit in?Journal of Unmanned Vehicle Systems, 3(4):137–155, 2015

  18. [18]

    Review on robotic systems for environmental monitoring.IEEE Open Journal of Instrumentation and Measurement, 4:1–17, 2024

    DMG Preethichandra, Lasitha Piyathilaka, and Umer Izhar. Review on robotic systems for environmental monitoring.IEEE Open Journal of Instrumentation and Measurement, 4:1–17, 2024

  19. [19]

    Hvala, Rebecca M

    Aliesha H. Hvala, Rebecca M. Rogers, Mamoun Alazab, and Hamish A. Campbell. Supplementing aerial drone surveys with biotelemetry data validates wildlife detection probabilities.Frontiers in Conservation Science, 4:1203736, 2023

  20. [20]

    Elephant habituation to drones as a behavioural observation tool.Scientific Reports, 15:39329, 2025

    Angus Carey-Douglas, Liam Jasperse-Sjolander, Paul Kokiro, Gideon Galimogle Ilterewa, David Lolchuragi, Jemima Elizabeth Scrase, Frank Pope, Fritz V ollrath, and Giacomo D’Ammando. Elephant habituation to drones as a behavioural observation tool.Scientific Reports, 15:39329, 2025

  21. [21]

    Gokarna Jung Thapa, Kanchan Thapa, Shashank Poudel, Dil Bahadur Purja Pun, Sujita Shrestha, Prem Poudel, Hari Bhadra Acharya, Bal Kumar Lamsal, Rajesh Sada, and Serge A. Wich. Eyes in the sky: Drone monitoring of the largest gharial and mugger populations in the east rapti river, chitwan national park.PLOS ONE, 20(8):e0330350, 2025

  22. [22]

    Challenges and limitations of ai in wildlife con- servation

    Uma Shekhawat, Udham Singh Rana, and Neenu Saini. Challenges and limitations of ai in wildlife con- servation. InAI and Machine Learning Techniques for Wildlife Conservation, pages 97–132. IGI Global Scientific Publishing, 2025

  23. [23]

    Impact of drone disturbances on wildlife: A review.Drones, 9(4):311, 2025

    Salman Afridi et al. Impact of drone disturbances on wildlife: A review.Drones, 9(4):311, 2025

  24. [24]

    Vanderklift, and Amanda J

    Daniel Axford, Ferdous Sohel, Mathew A. Vanderklift, and Amanda J. Hodgson. Collectively advancing deep learning for animal detection in drone imagery: Successes, challenges, and research gaps.Ecological Informatics, 83:102842, 2024

  25. [25]

    Xiaohui Li, Hailong Huang, and Andrey V . Savkin. Autonomous navigation of an aerial drone to observe a group of wild animals with reduced visual disturbance.IEEE Systems Journal, 16(2):3339–3348, 2022

  26. [26]

    Kline, Alison Zhong, Kevyn Irizarry, Charles V

    Jenna M. Kline, Alison Zhong, Kevyn Irizarry, Charles V . Stewart, Christopher Stewart, Daniel I. Rubenstein, and Tanya Berger-Wolf. Wildwing: An open-source, autonomous and affordable uas for animal behaviour video monitoring.Methods in Ecology and Evolution, 2025

  27. [27]

    Wilddrone: autonomous drone technology for monitoring wildlife populations

    Ulrik Pagh Schultz Lundquist, Salman Afridi, Camille Berthelot, Nguyen Ngoc Dat, Konrad Hlebowicz, Edoardo Iannino, et al. Wilddrone: autonomous drone technology for monitoring wildlife populations. Frontiers in Robotics and AI, 12:1695319, 2026

  28. [28]

    Deep reinforcement learning of mobile robot navigation in dynamic environment: A review.Sensors, 25(11):3394, 2025

    Yingjie Zhu, Wan Zuha Wan Hasan, Hafiz Rashidi Harun Ramli, Nor Mohd Haziq Norsahperi, Muhamad Saufi Mohd Kassim, and Yiduo Yao. Deep reinforcement learning of mobile robot navigation in dynamic environment: A review.Sensors, 25(11):3394, 2025

  29. [29]

    Adaptive policy switching for efficient multi-robot coordination using reinforcement learning.International Journal of Robotics & Control Systems, 5(6), 2025

    Mohamed Nadour, Lakhmissi Cherroun, Imad Eddine Tibermacine, Abdelaziz Rabehi, and Alfian Ma’arif. Adaptive policy switching for efficient multi-robot coordination using reinforcement learning.International Journal of Robotics & Control Systems, 5(6), 2025

  30. [30]

    Reinforcement learning in robotic applications: a comprehensive survey.Artificial intelligence review, 55(2):945–990, 2022

    Bharat Singh, Rajesh Kumar, and Vinay Pratap Singh. Reinforcement learning in robotic applications: a comprehensive survey.Artificial intelligence review, 55(2):945–990, 2022. Draft: June 9, 2026 24

  31. [31]

    Autonomous obstacle avoidance and target tracking of uav based on deep reinforcement learning.Journal of Intelligent & Robotic Systems, 104:60, 2022

    Guoqiang Xu, Weilai Jiang, Zhaolei Wang, and Yaonan Wang. Autonomous obstacle avoidance and target tracking of uav based on deep reinforcement learning.Journal of Intelligent & Robotic Systems, 104:60, 2022

  32. [32]

    Xiaoran Kong, Yatong Zhou, Zhe Li, and Shaohai Wang. Multi-uav simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments.Frontiers in Neurorobotics, 17:1302898, 2024

  33. [33]

    Socially aware navigation for mobile robots: a survey on deep reinforcement learning approaches.Applied Intelligence, 56(1):38, 2026

    Ibrahim Khalil Kabir and Muhammad Faizan Mysorewala. Socially aware navigation for mobile robots: a survey on deep reinforcement learning approaches.Applied Intelligence, 56(1):38, 2026

  34. [34]

    Evolution of socially-aware robot navigation.Electronics, 12(7):1570, 2023

    Silvia Guill ´en-Ruiz, Juan Pedro Bandera, Alejandro Hidalgo-Paniagua, and Antonio Bandera. Evolution of socially-aware robot navigation.Electronics, 12(7):1570, 2023

  35. [35]

    Multi-objective crowd-aware robot navigation system using deep reinforcement learning.Applied Soft Computing, 151:111154, 2024

    Chien-Lun Cheng, Chen-Chien Hsu, Saeed Saeedvand, and Jun-Hyung Jo. Multi-objective crowd-aware robot navigation system using deep reinforcement learning.Applied Soft Computing, 151:111154, 2024

  36. [36]

    Crowd-aware socially compliant robot navigation via deep reinforcement learning.International Journal of Social Robotics, 16:197–209, 2024

    Bingxin Xue, Ming Gao, Chaoqun Wang, Yao Cheng, and Fengyu Zhou. Crowd-aware socially compliant robot navigation via deep reinforcement learning.International Journal of Social Robotics, 16:197–209, 2024

  37. [37]

    Yoccoz, James D

    Nigel G. Yoccoz, James D. Nichols, and Thierry Boulinier. Monitoring of biological diversity in space and time.Trends in Ecology & Evolution, 16(8):446–453, 2001

  38. [38]

    Crofoot, Walter Jetz, and M

    Roland Kays, M.C. Crofoot, Walter Jetz, and M. Wikelski. Terrestrial animal tracking as an eye on life and planet.Science, 348:aaa2478–1, 01 2015

  39. [39]

    Human disturbance: people as predation-free predators?Journal of Applied Ecology, 41(2):335–343, 2004

    Colin M Beale and Pat Monaghan. Human disturbance: people as predation-free predators?Journal of Applied Ecology, 41(2):335–343, 2004

  40. [40]

    Meta-analysis of transmitter effects on avian behaviour and ecology.Methods in Ecology and Evolution, 1(2):180–187, 2010

    David G Barron, Jeffrey D Brawn, and Patrick J Weatherhead. Meta-analysis of transmitter effects on avian behaviour and ecology.Methods in Ecology and Evolution, 1(2):180–187, 2010

  41. [41]

    Bodey, Ian R

    Thomas W. Bodey, Ian R. Cleasby, Fraser Bell, Nicole Parr, Anthony Schultz, Stephen C. V otier, and Stuart Bearhop. A phylogenetically controlled meta-analysis of biologging device effects on birds: Deleterious effects and a call for more standardized reporting of study data.Methods in Ecology and Evolution, 9(4):946– 955, 2018

  42. [42]

    Opportunities and risks in the use of drones for studying animal behaviour

    Lukas Schad and Julia Fischer. Opportunities and risks in the use of drones for studying animal behaviour. Methods in Ecology and Evolution, 14(8):1864–1872, 2023

  43. [43]

    Unmanned aircraft systems in wildlife research: current and future applications of a transformative technology.Frontiers in Ecology and the Environment, 14(5):241–251, 2016

    Katherine S Christie, Sarah L Gilbert, Corinne L Brown, Michael Hatfield, and Lori Hanson. Unmanned aircraft systems in wildlife research: current and future applications of a transformative technology.Frontiers in Ecology and the Environment, 14(5):241–251, 2016

  44. [44]

    Approaching birds with drones: first experiments and ethical guidelines.Biology Letters, 11(2):20140754, 02 2015

    Elisabeth Vas, Am´elie Lescro¨el, Olivier Duriez, Guillaume Boguszewski, and David Gr´emillet. Approaching birds with drones: first experiments and ethical guidelines.Biology Letters, 11(2):20140754, 02 2015

  45. [45]

    Spiegel, Eleanor R

    Isla Duporge, Marcus P. Spiegel, Eleanor R. Thomson, Tatiana Chapman, Curt Lamberth, Caroline Pond, David W. Macdonald, Tiejun Wang, and Holger Klinck. Determination of optimal flight altitude to min- imise acoustic drone disturbance to wildlife using species audiograms.Methods in Ecology and Evolution, 12(11):2196–2207, 2021

  46. [46]

    A meta-analysis of the impact of drones on birds.Fron- tiers in Ecology and the Environment, 23(2):e2809, 2025

    ´Emile Brisson-Curadeau, Rose Lacombe, Marianne Gousy-Leblanc, Vanessa Poirier, Lauren Jackson, Christina Petalas, Eliane Miranda, Alyssa Eby, Julia Baak, Don-Jean L ´eandri-Breton, Emily Choy, Jade Legros, Elena Tranze-Drabinia, and Kyle H Elliott. A meta-analysis of the impact of drones on birds.Fron- tiers in Ecology and the Environment, 23(2):e2809, 2...

  47. [47]

    Human-aware robot navigation: A survey.Robotics and Autonomous Systems, 61(12):1726–1743, 2013

    Thibault Kruse, Amit Kumar Pandey, Rachid Alami, and Alexandra Kirsch. Human-aware robot navigation: A survey.Robotics and Autonomous Systems, 61(12):1726–1743, 2013

  48. [48]

    Hamid Rezatofighi, and Damith C

    Hoa Van Nguyen, Michael Chesser, Lian Pin Koh, S. Hamid Rezatofighi, and Damith C. Ranasinghe. Trackerbots: Autonomous unmanned aerial vehicle for real-time localization and tracking of multiple radio- tagged animals.Journal of Field Robotics, 36(3):617–635, 2019

  49. [49]

    Taggart, Katrina Falkner, S

    Fei Chen, Hoa Van Nguyen, David A. Taggart, Katrina Falkner, S. Hamid Rezatofighi, and Damith C. Ranas- inghe. Conservationbots: Autonomous aerial robot for fast robust wildlife tracking in complex terrains. Journal of Field Robotics, 41(2):443–469, 2024

  50. [50]

    Decentralized multi-drone coordination for wildlife video acquisition

    Denys Grushchak, Jenna Kline, Danilo Pianini, Nicolas Farabegoli, Gianluca Aguzzi, Martina Baiardi, and Christopher Stewart. Decentralized multi-drone coordination for wildlife video acquisition. In2024 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS), pages 31–40, 2024

  51. [51]

    Human-caused disturbance stimuli as a form of predation risk.Conser- vation Ecology, 6(1), 2002

    Alejandro Frid and Lawrence Dill. Human-caused disturbance stimuli as a form of predation risk.Conser- vation Ecology, 6(1), 2002

  52. [52]

    Lima and Peter A

    Steven L. Lima and Peter A. Bednekoff. Temporal variation in danger drives antipredator behavior: The predation risk allocation hypothesis.The American Naturalist, 153(6):649–659, 1999

  53. [53]

    Unmanned aircraft systems as a new source of disturbance for wildlife: A systematic review.PLOS ONE, 12(6):1–14, 06 2017

    Margarita Mulero-P ´azm´any, Susanne Jenni-Eiermann, Nicolas Strebel, Thomas Sattler, Juan Jos ´e Negro, and Zulima Tablado. Unmanned aircraft systems as a new source of disturbance for wildlife: A systematic review.PLOS ONE, 12(6):1–14, 06 2017

  54. [54]

    Emily Bennitt, Hattie L. A. Bartlam-Brooks, Tatjana Y . Hubel, and Alan M. Wilson. Terrestrial mammalian wildlife responses to unmanned aerial systems approaches.Scientific Reports, 9(1):2142, Feb 2019

  55. [55]

    Andrew Sih, Goran Englund, and David Wooster. Emergent impacts of multiple predators on prey. Trends in Ecology & Evolution, 13(9):350–355, 1998

  56. [56]

    Ran Nathan, Wayne M Getz, Eloy Revilla, Marcel Holyoak, Ronen Kadmon, David Saltz, and Peter E Smouse. A movement ecology paradigm for unifying organismal movement research. Proceedings of the National Academy of Sciences, 105(49):19052–19059, 2008

  57. [57]

    William F. Fagan and Justin M. Calabrese. The correlated random walk and the rise of movement ecology. The Bulletin of the Ecological Society of America, 95(3):204–206, 2014

  58. [58]

    P. M. Kareiva and N. Shigesada. Analyzing insect movement as a correlated random walk. Oecologia, 56(2):234–238, Feb 1983

  59. [59]

    Edward A Codling, Michael J Plank, and Simon Benhamou. Random walk models in biology. Journal of The Royal Society Interface, 5(25):813–834, 04 2008

  60. [60]

    Arik Dorfman, Thomas T. Hills, and Inon Scharf. A guide to area-restricted search: a foundational foraging behaviour. Biological Reviews, 97(6):2076–2089, 2022

  61. [61]

    Jason Matthiopoulos. The use of space by animals as a function of accessibility and preference. Ecological Modelling, 159(2):239–268, 2003

  62. [62]

    Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 2nd edition, 2018

  63. [63]

    Florian Felten, El-Ghazali Talbi, and Grégoire Danoy. Multi-objective reinforcement learning based on decomposition: A taxonomy and framework. J. Artif. Int. Res., 79, April 2024.

    Draft: June 9, 2026

  64. [64]

    Chen Tang, Ben Abbatematteo, Jiaheng Hu, Rohan Chandra, Roberto Martín-Martín, and Peter Stone. Deep reinforcement learning for robotics: A survey of real-world successes. Annual Review of Control, Robotics, and Autonomous Systems, 8:153–188, 2025

  65. [65]

    Diederik M Roijers, Peter Vamplew, Shimon Whiteson, and Richard Dazeley. A survey of multi-objective sequential decision-making. Journal of Artificial Intelligence Research, 48:67–113, 2013

  66. [66]

    Mikko Lauri, David Hsu, and Joni Pajarinen. Partially observable Markov decision processes in robotics: A survey. IEEE Transactions on Robotics, 39(1):21–40, 2023

  67. [67]

    Leslie Pack Kaelbling, Michael L. Littman, and Anthony R. Cassandra. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1):99–134, 1998

  68. [68]

    Javier García and Fernando Fernández. A comprehensive survey on safe reinforcement learning. Journal of Machine Learning Research, 16(42):1437–1480, 2015

  69. [69]

    Wenshuai Zhao, Jorge Peña Queralta, and Tomi Westerlund. Sim-to-real transfer in deep reinforcement learning for robotics: a survey. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pages 737–744, 2020

  70. [70]

    Andrew Y Ng, Daishi Harada, and Stuart J Russell. Policy invariance under reward transformations: theory and application to reward shaping. In International Conference on Machine Learning (ICML), pages 278–287, 1999

  71. [71]

    David J. Barrientos R., Marie Chantelle C. Medina, Bruno J. T. Fernandes, and Pablo V. A. Barros. The use of reinforcement learning algorithms in object tracking: A systematic literature review. Neurocomputing, 596:127954, 2024

  72. [72]

    Ender Çetin, Cristina Barrado, Guillem Muñoz, Miquel Macias, and Enric Pastor. Drone navigation and avoidance of obstacles through deep reinforcement learning. In 2019 IEEE/AIAA 38th Digital Avionics Systems Conference (DASC), pages 1–7, 2019

  73. [73]

    Xingjian Zhang, Yizhuo Wang, and Guillaume Sartoretti. Compass: Cooperative multi-agent persistent monitoring using spatio-temporal attention network. In 2025 IEEE International Symposium on Multi-Robot and Multi-Agent Systems (MRS), pages 1–7, 2025

  74. [74]

    Michael Bar-Ziv, Hilla Ziv, Mookie Breuer, Eitam Arnon, Assaf Uzan, and Orr Spiegel. Spur-winged lapwings show spatial behavioural types with different mobility and exploration between urban and rural individuals. Proceedings of the Royal Society B: Biological Sciences, 292(2038):20242471, 2025

  75. [75]

    Teddy Lazebnik and Orr Spiegel. Individual variation affects outbreak magnitude and predictability in multi-pathogen model of pigeons visiting dairy farms. Ecological Modelling, 499:110925, 2025

  76. [76]

    Khalil Satyadama, Reza Fuad Rachmadi, and Supeno Mardi Susiki Nugroho. Procedural environment generation for cave 3d model using opensimplex noise and marching cube. In CENIM 2020 - Proceeding, CENIM 2020 - Proceeding: International Conference on Computer Engineering, Network, and Intelligent Multimedia 2020, pages 144–148, United States, November 2020. In...

  77. [77]

    Daniel A. Soluk and Nicholas C. Collins. Synergistic interactions between fish and stoneflies: Facilitation and interference among stream predators. Oikos, 52(1):94–100, 1988

  78. [78]

    John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms, 2017.

  79. [79]

    Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 1861–1870. PMLR, 10...

  80. [80]

    Zhengwei Zhu, Miaojie Chen, Chenyang Zhu, and Yanping Zhu. Effective defense strategies in network security using improved double dueling deep q-network. Computers & Security, 136:103578, 2024
