Distill-Belief: Closed-Loop Inverse Source Localization and Characterization in Physical Fields
Pith reviewed 2026-05-07 16:12 UTC · model grok-4.3
The pith
A teacher-student framework decouples Bayesian correctness from efficient control in closed-loop source localization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Distill-Belief is a teacher-student framework that decouples correctness from efficiency: a Bayes-correct particle-filter teacher maintains the posterior and supplies a dense information-gain signal, while a compact student distills the posterior into belief statistics for control and an uncertainty certificate for stopping. At deployment only the student is used, yielding constant per-step cost.
What carries the argument
Teacher-student distillation where the particle-filter teacher provides information-gain signals that train the student to produce usable belief statistics and stopping certificates.
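To make the teacher's role concrete, the following is a minimal sketch of one particle-filter weight update and the information gain it yields, taken here as the realized drop in Shannon entropy of the particle weights. The Gaussian likelihood, the one-dimensional `field_model`, and every function name are illustrative assumptions. This page does not specify the paper's field models or its gain estimator.

```python
import math

def entropy(weights):
    """Shannon entropy (nats) of a normalized weight vector."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)

def field_model(source_x, sensor_x, strength=1.0):
    """Hypothetical 1-D field: softened inverse-square decay (assumption)."""
    return strength / (1.0 + (source_x - sensor_x) ** 2)

def teacher_update(particles, weights, sensor_x, reading, noise_std=0.1):
    """Bayes update of particle weights; returns (new_weights, info_gain)."""
    likelihoods = [
        math.exp(-0.5 * ((reading - field_model(p, sensor_x)) / noise_std) ** 2)
        for p in particles
    ]
    unnorm = [w * lik for w, lik in zip(weights, likelihoods)]
    total = sum(unnorm)
    if total == 0.0:
        # Degenerate case: the reading is inconsistent with every particle.
        return list(weights), 0.0
    new_weights = [u / total for u in unnorm]
    return new_weights, entropy(weights) - entropy(new_weights)

particles = [i * 0.5 for i in range(21)]            # candidate source positions 0..10
weights = [1.0 / len(particles)] * len(particles)   # uniform prior
reading = field_model(4.0, sensor_x=3.0)            # noiseless reading, true source at 4.0
weights, gain = teacher_update(particles, weights, sensor_x=3.0, reading=reading)
```

In this toy setup a single reading cannot distinguish a source at 2.0 from one at 4.0, since both lie one unit from the sensor, so the posterior keeps two equal modes. That is the kind of ambiguity a further, well-chosen measurement is meant to resolve.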
If this is right
- Sensing cost decreases while success rate, posterior contraction, and estimation accuracy increase relative to baselines.
- Reward hacking is mitigated because the student is trained against a Bayes-correct information-gain signal.
- Constant per-step computation enables real-time operation under strict time constraints.
- The method generalizes across seven distinct field modalities and two stress-test conditions.
Where Pith is reading between the lines
- The same separation of rigorous teacher from lightweight student could transfer to other belief-space planning problems where exact inference is prohibitive.
- In field robotics this structure may allow low-power platforms to retain Bayesian-level uncertainty handling without on-board particle filters.
- If the student is periodically refreshed from the teacher, the framework could handle slowly drifting field statistics without full re-derivation of the policy.
Load-bearing premise
The student can approximate the teacher's information-gain signal and uncertainty estimates closely enough that the deployed policy reduces uncertainty rather than exploiting approximation errors.
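One way to put a number on "closely enough" is to compare teacher and student signals step by step on held-out trajectories. The sketch below reports the mean and worst-case absolute gap. The function name and the toy values are assumptions for illustration, not quantities taken from the paper.

```python
def approximation_error(teacher_gain, student_gain):
    """Mean and worst-case absolute gap between teacher and student signals."""
    if len(teacher_gain) != len(student_gain) or not teacher_gain:
        raise ValueError("need two non-empty, equal-length sequences")
    gaps = [abs(t - s) for t, s in zip(teacher_gain, student_gain)]
    return sum(gaps) / len(gaps), max(gaps)

# Toy per-step information-gain values (made up for illustration).
teacher = [0.50, 0.25, 0.125, 0.0]
student = [0.25, 0.25, 0.25, 0.0]
mean_gap, worst_gap = approximation_error(teacher, student)
```

The worst-case gap is arguably the more relevant figure here, because a policy that exploits approximation error would tend to seek out exactly the states where the gap is largest.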
What would settle it
Execute the student policy in simulation or hardware, then recompute the true posterior offline with the teacher; if uncertainty fails to contract or reward hacking reappears at rates comparable to baselines, the central claim does not hold.
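The check described above can be sketched as a comparison between two per-step sequences from one student-driven episode: the posterior entropy recomputed offline by the teacher, and the uncertainty the student reported for itself. The function name, the dictionary keys, and the tolerance `tol` are hypothetical choices, and the numbers are made up.

```python
def contraction_report(teacher_entropy, student_certificate, tol=0.0):
    """Compare offline teacher entropies with the student's own certificate.

    Both inputs are per-step sequences along one student-driven episode.
    """
    if len(teacher_entropy) != len(student_certificate) or len(teacher_entropy) < 2:
        raise ValueError("need two equal-length sequences with at least two steps")
    true_drop = teacher_entropy[0] - teacher_entropy[-1]
    claimed_drop = student_certificate[0] - student_certificate[-1]
    return {
        "true_drop": true_drop,
        "claimed_drop": claimed_drop,
        "contracted": true_drop > tol,
        # Suspicious: the student reports shrinking uncertainty that the
        # recomputed posterior does not confirm.
        "hacking_suspected": claimed_drop > tol and true_drop <= tol,
    }

honest = contraction_report([3.0, 2.1, 1.2, 0.4], [2.9, 2.0, 1.3, 0.5])
hacked = contraction_report([3.0, 2.9, 3.0, 3.1], [2.9, 2.0, 1.1, 0.3])
```

Aggregating the `hacking_suspected` flag over many episodes would give a rate that can be set against the same rate for the baselines.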
Original abstract
Closed-loop inverse source localization and characterization (ISLC) requires a mobile agent to select measurements that localize sources and infer latent field parameters under strict time constraints. The core challenge lies in the belief-space objective: valid uncertainty estimation requires expensive Bayesian inference, whereas using a fast learned belief model leads to reward hacking, in which the policy exploits approximation errors rather than actually reducing uncertainty. We propose Distill-Belief, a teacher-student framework that decouples correctness from efficiency. A Bayes-correct particle-filter teacher maintains the posterior and supplies a dense information-gain signal, while a compact student distills the posterior into belief statistics for control and an uncertainty certificate for stopping. At deployment, only the student is used, yielding constant per-step cost. Experiments on seven field modalities and two stress tests show that Distill-Belief consistently reduces sensing cost and improves success, posterior contraction, and estimation accuracy over baselines, while mitigating reward hacking.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Distill-Belief, a teacher-student framework for closed-loop inverse source localization and characterization (ISLC) in physical fields. A Bayes-correct particle-filter teacher maintains the posterior and supplies a dense information-gain signal, while a compact student distills the posterior into belief statistics for control and an uncertainty certificate for stopping. At deployment only the student is used, which gives constant per-step cost. Experiments on seven field modalities and two stress tests report consistent reductions in sensing cost and improvements in success rate, posterior contraction, and estimation accuracy over baselines, while mitigating reward hacking.
Significance. If the student's approximation of the teacher's information-gain and uncertainty signals is sufficiently faithful, the framework offers a practical route to deploy belief-space planning on resource-limited agents without incurring repeated expensive Bayesian inference. The explicit separation of correctness (teacher) from efficiency (student) and the multi-modality empirical evaluation are strengths that could influence inverse-problem robotics and active sensing literature.
major comments (3)
- [Section 3] Section 3 (distillation objective): no error bounds, Lipschitz constants, or worst-case analysis are supplied on how student approximation error propagates into the belief-space objective or the stopping criterion. This is load-bearing for the central claim that the deployed policy reduces true uncertainty rather than exploiting approximation errors.
- [Experimental evaluation] Experimental evaluation (abstract and results sections): claims of consistent gains across seven modalities and two stress tests are presented without specification of the exact baselines, statistical tests, data-exclusion rules, or implementation details, preventing verification of the reported improvements in success, contraction, and accuracy.
- [Experimental evaluation] No ablation is reported in which the teacher's information-gain signal is substituted at test time to measure whether the student-only policy achieves comparable posterior contraction; without this check the mitigation of reward hacking rests on an unverified faithfulness assumption.
minor comments (1)
- [Abstract] The abstract would be clearer if it briefly named the seven field modalities and the two stress tests rather than leaving them unspecified.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below, indicating the revisions we will make to the manuscript.
- Referee: [Section 3] Section 3 (distillation objective): no error bounds, Lipschitz constants, or worst-case analysis are supplied on how student approximation error propagates into the belief-space objective or the stopping criterion. This is load-bearing for the central claim that the deployed policy reduces true uncertainty rather than exploiting approximation errors.
  Authors: We acknowledge that Section 3 presents the distillation objective without formal error bounds, Lipschitz constants, or worst-case propagation analysis. Deriving such guarantees is non-trivial given the stochastic particle-filter teacher and the heterogeneous, non-convex nature of the physical fields considered; standard Lipschitz assumptions do not hold uniformly. The central claim is instead grounded in the multi-modality empirical results showing consistent posterior contraction. In the revision we will add a dedicated paragraph in Section 3 that quantifies observed student-teacher approximation error on held-out trajectories and discusses its measured effect on the stopping criterion.
  revision: partial
- Referee: [Experimental evaluation] Experimental evaluation (abstract and results sections): claims of consistent gains across seven modalities and two stress tests are presented without specification of the exact baselines, statistical tests, data-exclusion rules, or implementation details, preventing verification of the reported improvements in success, contraction, and accuracy.
  Authors: We agree that greater specificity is needed for reproducibility. The original manuscript describes the baselines and metrics in Section 4 and supplies implementation details in the appendix, but these elements will be consolidated and expanded in the main text. The revision will explicitly enumerate all baselines, state the statistical tests (paired t-tests at p < 0.05), confirm that no trials were excluded, and list key hyperparameters together with a pointer to the released code.
  revision: yes
- Referee: [Experimental evaluation] No ablation is reported in which the teacher's information-gain signal is substituted at test time to measure whether the student-only policy achieves comparable posterior contraction; without this check the mitigation of reward hacking rests on an unverified faithfulness assumption.
  Authors: We concur that this ablation would provide direct evidence of distillation faithfulness. The submitted version does not contain such an experiment. We will run the requested ablation (deploying the student policy while substituting the teacher's information-gain signal at test time) and report the resulting posterior-contraction curves alongside the student-only results in the revised experimental section.
  revision: yes
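The rebuttal names paired t-tests at p < 0.05 as the statistical test. A minimal sketch of the paired t-statistic on matched trials follows, using only the standard library. The function name is hypothetical and the per-seed scores are made-up illustration data, not results from the paper. The critical value 2.776 is the standard two-sided 5% threshold for Student's t with 4 degrees of freedom.

```python
import math

def paired_t_statistic(method_scores, baseline_scores):
    """Return (t, degrees_of_freedom) for a paired t-test on matched trials."""
    if len(method_scores) != len(baseline_scores) or len(method_scores) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [m - b for m, b in zip(method_scores, baseline_scores)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    if var == 0.0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean / math.sqrt(var / n), n - 1

# Made-up per-seed sensing costs (lower is better), not results from the paper.
baseline = [12.0, 15.0, 11.0, 14.0, 13.0]
method = [10.0, 12.0, 10.0, 11.0, 12.0]
t_stat, dof = paired_t_statistic(method, baseline)
T_CRIT_DF4 = 2.776  # two-sided 5% critical value of Student's t, 4 d.o.f.
significant = abs(t_stat) > T_CRIT_DF4
```

Pairing by seed or by scenario matters here: the test operates on per-trial differences, so each method must be run on the same set of trials as the baseline it is compared with.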
Circularity Check
No circularity: framework uses standard particle-filter teacher and distillation with empirical validation
The paper presents Distill-Belief as a teacher-student architecture where a particle-filter teacher computes exact posteriors and information gain, and the student is trained to approximate belief statistics and uncertainty certificates. Claims of reduced sensing cost, improved posterior contraction, and mitigation of reward hacking are supported by experiments on seven modalities and stress tests rather than any closed-form derivation. No equations or steps in the abstract or described framework reduce the performance metrics to a fitted quantity or self-referential definition by construction. The faithfulness assumption is an empirical claim open to falsification via the reported results, not a tautology. This matches the default case of a non-circular engineering framework relying on known Bayesian and distillation techniques.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Particle-filter approximation yields a sufficiently accurate posterior for information-gain computation
- ad hoc to paper: Distillation transfers the dense information-gain signal without introducing exploitable approximation errors
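The first axiom can be probed at run time with a standard particle-filter diagnostic, the effective sample size (ESS), which falls toward 1 as the weights collapse onto a few particles. The sketch below is a generic diagnostic offered under that assumption; this page does not say whether the paper monitors ESS or how it resamples.

```python
def effective_sample_size(weights):
    """Kish effective sample size of (possibly unnormalized) particle weights."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have positive sum")
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)

uniform_ess = effective_sample_size([0.25, 0.25, 0.25, 0.25])  # evenly spread weights
collapsed_ess = effective_sample_size([1.0, 0.0, 0.0, 0.0])    # all mass on one particle
```

A low ESS relative to the particle count suggests the weight-based entropy, and hence any information gain derived from it, rests on very few particles.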
Venue (as printed in the paper's running header): KDD ’26, August 9–13, 2026, International Convention Center Jeju (ICC Jeju), Jeju, Republic of Korea