Repeated Deceptive Path Planning against Learnable Observer

Kaiqi Huang; Lei Cui; Likun Yang; Pei Xu; Shiyue Cao; Shiyu Zhang; Shizhao Yu; Xiaotang Chen; Yongjian Ren

arxiv: 2605.07174 · v1 · submitted 2026-05-08 · 💻 cs.AI

Shiyue Cao, Pei Xu, Likun Yang, Lei Cui, Shizhao Yu, Shiyu Zhang, Yongjian Ren, Xiaotang Chen, Kaiqi Huang

Pith reviewed 2026-05-11 00:58 UTC · model grok-4.3

keywords: deceptive path planning · learnable observers · repeated interactions · meta-planning · adaptation lag · multi-agent systems · privacy in navigation

The pith

Deceptive Meta Planning uses cross-episode feedback to prevent adaptation lag against observers that learn destination predictions from past trajectories.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper frames deceptive path planning as a repeated game in which an agent must conceal its true goal while an observer improves its guesses by training on earlier paths. Standard methods either ignore the observer's learning or update too slowly, allowing prediction errors to compound across episodes. DeMP adds a meta layer that observes how the observer's model changed in prior episodes and optimizes the next policy to anticipate those changes. This two-level structure keeps deception effective longer than incremental adaptation alone. A sympathetic reader would care because many real settings, from secure transport to privacy-sensitive navigation, involve ongoing interactions with adaptive adversaries rather than one-shot static ones.

Core claim

Existing deceptive planners fail in repeated settings because incremental updates create accumulating lag relative to an observer that retrains on each new trajectory; DeMP counters this with episode-level policy adjustment to the latest observer model plus meta-level optimization over cross-episode feedback that learns the observer's update dynamics, yielding sustained deception success without sacrificing path cost.

What carries the argument

Deceptive Meta Planning (DeMP), a two-level optimization that performs short-term policy adaptation within each episode and meta-updates across episodes to model and preempt how the observer revises its destination predictions.
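To make the lag argument concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm. The observer is reduced to a single hypothetical scalar that shifts by a constant `drift` each episode, the agent's ideal response is assumed to equal the observer's current state, and the meta level is replaced by a plain one-step extrapolation from the last two observed states. All names (`simulate`, `drift`, `anticipate`) are illustrative assumptions.

```python
def simulate(num_episodes=20, drift=0.5, anticipate=False):
    """Return the per-episode gap between the agent's response and the observer state.

    Toy analogy only: the observer is a hypothetical scalar, not a learned model.
    """
    observer = 0.0        # hypothetical scalar summary of the observer's model
    prev_observer = None  # observer state seen one episode earlier
    response = 0.0        # agent's response; ideally equal to the observer state
    gaps = []
    for _ in range(num_episodes):
        gaps.append(abs(observer - response))
        # Episode-level step: react to the observer state that was just seen.
        next_response = observer
        # Meta-level step (toy stand-in): estimate the observer's update from
        # the last two episodes and extrapolate one step ahead.
        if anticipate and prev_observer is not None:
            next_response += observer - prev_observer
        prev_observer = observer
        observer += drift  # observer retrains; assumed constant shift per episode
        response = next_response
    return gaps
```

Under these assumptions, `simulate(5)` leaves a gap of 0.5 in every episode after the first, because the purely reactive agent always answers the previous observer. `simulate(5, anticipate=True)` closes the gap from the third episode onward, once two observer states are available to extrapolate from.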

If this is right

Single-level deceptive planners lose effectiveness over repeated episodes as the observer's accuracy improves faster than the agent's responses.
Adding previous observer predictions to each update reduces but does not eliminate accumulating lag.
Meta-level optimization that explicitly tracks observer learning patterns restores sustained deception performance.
The approach preserves near-optimal path lengths while raising the observer's prediction error over multiple episodes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same two-level structure could apply to other repeated privacy or security games in which one party must hide intent from a learner that retrains on observed actions.
If the observer's internal model is entirely inaccessible, meta-updates might still work by treating observed prediction errors as the sole training signal for anticipating future shifts.
Practical extensions could include combining DeMP with uncertainty estimates over possible observer learning rules to handle partial observability of the adversary's training data.

Load-bearing premise

Observer model updates are regular enough that cross-episode performance feedback alone can be used to accelerate the agent's future adaptations without direct access to the observer's parameters or learning rule.

What would settle it

A controlled test in which the observer switches to a completely different learning rule each episode, such that meta-updates trained on prior episodes produce no measurable reduction in deception lag compared with plain incremental adaptation.
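A toy version of this test can be sketched in Python under strong simplifying assumptions. It does not model DeMP: the observer is a hypothetical scalar whose update is drawn independently and uniformly from [-1, 1] each episode, so past updates carry no information about the next one, and the meta level is stood in for by a one-step extrapolation. The function name `mean_gaps` and all parameters are illustrative.

```python
import random


def mean_gaps(num_episodes=2000, seed=0):
    """Compare a reactive and an extrapolating response under unpredictable updates.

    Returns (mean reactive gap, mean extrapolating gap). Toy analogy only.
    """
    rng = random.Random(seed)
    observer, prev_observer = 0.0, 0.0
    reactive_total = 0.0
    anticipating_total = 0.0
    for _ in range(num_episodes):
        # Both responses are fixed before the observer's next update is known.
        reactive = observer
        anticipating = observer + (observer - prev_observer)
        prev_observer = observer
        # A fresh update, unrelated to the previous one, each episode.
        observer += rng.uniform(-1.0, 1.0)
        reactive_total += abs(observer - reactive)
        anticipating_total += abs(observer - anticipating)
    return reactive_total / num_episodes, anticipating_total / num_episodes
```

In this setting extrapolation should not help: for independent uniform updates on [-1, 1] the expected reactive gap is 1/2, while the expected extrapolating gap is 2/3. A result of this shape for the real method, with no measurable benefit over incremental adaptation, is what the proposed test would look for.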

Figures

Figures reproduced from arXiv: 2605.07174 by Kaiqi Huang, Lei Cui, Likun Yang, Pei Xu, Shiyue Cao, Shiyu Zhang, Shizhao Yu, Xiaotang Chen, Yongjian Ren.

**Figure 2.** Trajectory Evolution of DeMP in RDPP. The left upper panel shows the static trajectory of the baseline method […]

**Figure 3.** The two-level optimization framework of DeMP. The process is structured into two levels: (1) The Episode-Level […]

**Figure 4.** Deception performance and trajectory cost in repeated interactions. The first row corresponds to the […]

**Figure 5.** t-SNE Projection of Path Features. The figure com […]

**Figure 6.** Analysis of DeMP under repeated deceptive plan […]

**Figure 1.** Observer pretraining dynamics and predictive performance. (a–c) Predicted probability of the true […]

Original abstract

We study the problem of deceptive path planning (DPP), where an agent aims to conceal its true destination from external observers. While existing work assumes static, non-learning observers, real-world adversaries, such as those in critical goods transportation or military operations, can adapt by learning from historical trajectories. To address this gap, we introduce Repeated Deceptive Path Planning (RDPP), a new formulation that explicitly models learnable observers. We show that existing DPP methods fail under this setting, as they cannot adapt to evolving adversarial predictions. While incorporating the observer's previous predictions into updates enables some adaptation, such incremental updates cause accumulative lag that degrades deception. To this end, we propose Deceptive Meta Planning (DeMP), a two-level optimization framework that combines episode-level adaptation, which enables short-term policy adjustment to counter the updated observer, and meta-level updates, which leverage cross-episode feedback to capture how observers update their models and accelerate adaptation in future episodes. In this way, DeMP mitigates the accumulation of adaptation lag, enabling sustained deception against a learning observer. Experiments across environments demonstrate that DeMP significantly outperforms existing approaches in RDPP while maintaining competitive path cost. Our results highlight the importance of modeling repeated interactions with learnable adversaries, providing new insights into deception and privacy in multi-agent systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RDPP and DeMP formalize deceptive planning against learning observers and add a meta-level to cut adaptation lag, but the abstract gives almost no experimental detail or analysis to back the central claim.

The paper's main move is to define repeated deceptive path planning where the observer updates its model from past trajectories, then show that plain DPP and simple incremental fixes fall behind because of lag. DeMP tries to fix that with episode-level adaptation plus meta-updates that use cross-episode data to anticipate how the observer will change next time. That framing is new relative to the static-observer DPP work they cite, and it lines up with real settings like repeated security or logistics tasks where the adversary can learn. The two-level split is a straightforward way to separate short-term reaction from longer-term pattern capture. The abstract claims DeMP keeps deception effective while path costs stay competitive, which is the sort of practical result that could matter for applications.

The weak part is the evidence. No description of the observer's learning rule, the test environments, the baselines, or any statistical checks appears in the abstract, and the full text does not seem to supply identifiability arguments or stress tests for non-stationary observers. The stress-test point holds: if the meta-optimizer has to infer update dynamics from trajectories alone without knowing the rule or having direct access, it is not obvious when that inference stays reliable. Without bounds or ablation on changing observer behavior, the lag-mitigation benefit could disappear outside the training distribution.

This is the kind of paper that would interest people working on planning with adaptive adversaries or privacy in multi-agent systems. A reader already familiar with DPP would see the extension clearly and might pick up the meta-planning idea for their own setting. It is coherent on its own terms and shows honest engagement with the limitation of prior work, so it clears the bar for a serious referee even though the current version would need stronger experiments and some analysis of when the meta-updates actually help.

Referee Report

2 major / 2 minor

Summary. The paper introduces Repeated Deceptive Path Planning (RDPP) to model deceptive path planning against observers that learn and adapt from historical trajectories, unlike prior work assuming static observers. It shows that existing DPP methods and simple incremental updates suffer from accumulating adaptation lag. The proposed Deceptive Meta Planning (DeMP) uses a two-level optimization: episode-level adaptation for short-term policy adjustment against the current observer model, plus meta-level updates that leverage cross-episode trajectory feedback to infer and accelerate response to how the observer revises its predictions. Experiments across environments are claimed to show DeMP outperforming baselines in deception effectiveness while keeping competitive path costs.

Significance. If the empirical claims hold under rigorous validation, the work fills a gap by explicitly modeling repeated interactions with adaptive adversaries in deception planning. The two-level meta-optimization approach offers a concrete way to sustain deception without direct observer model access, with potential relevance to privacy and security applications in multi-agent systems.

major comments (2)

[Abstract] Abstract: the claim that 'experiments across environments demonstrate that DeMP significantly outperforms existing approaches' provides no information on baselines, metrics (e.g., deception success rate, path cost), environments, number of trials, or statistical tests. This absence makes it impossible to evaluate support for the central claim that DeMP mitigates adaptation lag.
[DeMP framework] DeMP framework description (likely §3 or §4): the meta-level component is asserted to 'capture how observers update their models' from cross-episode trajectories alone, without direct access or knowledge of the learning rule. No identifiability result, convergence bound, or analysis is given for cases where the observer update is non-stationary, high-dimensional, or outside the meta-training distribution; this assumption is load-bearing for the claimed lag-mitigation benefit.

minor comments (2)

[Method] Clarify the exact form of the meta-update rule with explicit equations or pseudocode, including how trajectory histories are encoded and what loss is optimized at the meta level.
[Discussion] Add a limitations or assumptions subsection discussing when the meta-optimizer may fail (e.g., non-stationary observer rules).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback, which helps clarify the presentation of our contributions on repeated deceptive path planning. We address each major comment below and indicate the revisions made to the manuscript.

Referee: [Abstract] Abstract: the claim that 'experiments across environments demonstrate that DeMP significantly outperforms existing approaches' provides no information on baselines, metrics (e.g., deception success rate, path cost), environments, number of trials, or statistical tests. This absence makes it impossible to evaluate support for the central claim that DeMP mitigates adaptation lag.

Authors: We agree that the abstract would benefit from additional context to allow readers to assess the empirical support for the claims. In the revised manuscript, we have expanded the abstract to briefly specify the baselines (standard DPP and incremental-update methods), primary metrics (deception success rate measured by observer prediction error on the true goal, together with path cost), environments (discrete grid navigation and continuous control tasks), trial count (50 independent runs per setting), and use of statistical tests (paired t-tests, p < 0.05). These details are already reported in full in §5; the abstract update provides the necessary framing without exceeding length constraints.

revision: yes

Referee: [DeMP framework] DeMP framework description (likely §3 or §4): the meta-level component is asserted to 'capture how observers update their models' from cross-episode trajectories alone, without direct access or knowledge of the learning rule. No identifiability result, convergence bound, or analysis is given for cases where the observer update is non-stationary, high-dimensional, or outside the meta-training distribution; this assumption is load-bearing for the claimed lag-mitigation benefit.

Authors: The meta-level optimizer is trained to infer observer model changes solely from sequences of observed trajectories, without access to the observer's internal update rule or parameters. We do not claim universal identifiability or provide convergence bounds for arbitrary non-stationary, high-dimensional, or out-of-distribution observer updates; the framework relies on the meta-training distribution covering representative observer behaviors. In the revision we have added a dedicated limitations paragraph in §4.3 that explicitly states these scope conditions, reports additional experiments with varying observer learning rates and initial model mismatches, and notes that performance may degrade for observers whose update dynamics lie far outside the meta-training support. This clarifies the empirical basis for the lag-mitigation result while acknowledging the theoretical gap.

revision: partial
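
The first response above cites paired t-tests over 50 independent runs per setting. As a hedged illustration of that kind of check, and not the authors' analysis code, the sketch below computes a paired t statistic with the Python standard library. The function name `paired_t_statistic` is hypothetical. Turning the statistic into a p-value needs a t-distribution CDF, for which `scipy.stats.ttest_rel` is one existing option.

```python
import math
import statistics


def paired_t_statistic(method_a, method_b):
    """Return (t, degrees_of_freedom) for two paired samples of equal length."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Pair the runs (for example, same seed or same environment instance).
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

With made-up inputs such as `paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])`, the differences are 1, 2 and 3, giving t = 2·√3 ≈ 3.46 with 2 degrees of freedom. These numbers are purely illustrative and are not results from the paper.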

Circularity Check

0 steps flagged

No significant circularity; DeMP is a new two-level optimization framework with independent empirical claims

The paper defines RDPP as a new problem setting with learnable observers, demonstrates failure of prior DPP methods via incremental updates, and introduces DeMP as episode-level adaptation plus meta-level cross-episode feedback. These steps are presented as algorithmic design choices supported by experiments across environments, without reducing any prediction or central result to a fitted parameter, self-definition, or self-citation chain. No equations or uniqueness theorems are invoked that collapse back to the inputs by construction. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that observers learn from trajectories in a way that can be countered via meta-updates; no free parameters or invented physical entities are introduced.

axioms (1)

domain assumption: Observers update their predictive models based on historical trajectories in a learnable manner.
Invoked to define the RDPP setting and explain why incremental updates lag.

invented entities (1)

Deceptive Meta Planning (DeMP) two-level optimization framework (no independent evidence)
purpose: To combine episode-level policy adjustment with meta-level learning of observer updates.
Newly proposed construct without independent evidence outside the paper.

pith-pipeline@v0.9.0 · 5545 in / 1212 out tokens · 57350 ms · 2026-05-11T00:58:02.842284+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: DeMP ... two-level optimization framework that combines episode-level adaptation ... and meta-level updates, which leverage cross-episode feedback to capture how observers update their models

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Theorem 4.2 (Anticipation Mechanism ... meta-gradient ... second-order correction that reduces sensitivity to the observer’s learning dynamics

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

, volume =

Techniques for plan recognition. , volume =. User Modeling and User-Adapted Interaction , author =. 2001 , keywords =. doi:10.1023/A:1011118925938 , abstract =

work page doi:10.1023/a:1011118925938 2001

[2] [2]

Proceedings of the AAAI Conference on Artificial Intelligence , author =

Goal. Proceedings of the AAAI Conference on Artificial Intelligence , author =. 2022 , note =. doi:10.1609/aaai.v36i9.21198 , number =

work page doi:10.1609/aaai.v36i9.21198 2022

[3] [3]

Classification of Partial Discharges Originating From Multilevel PWM Using Machine Learning,

Adversarial. IEEE Robotics and Automation Letters , author =. 2022 , note =. doi:10.1109/LRA.2022.3148464 , abstract =

work page doi:10.1109/lra.2022.3148464 2022

[4] [4]

Amado, Leonardo and Fraga Pereira, Ramon and Meneguzzi, Felipe , month = jun, year =. Robust. Proceedings of the. doi:10.1609/aaai.v37i10.26408 , abstract =

work page doi:10.1609/aaai.v37i10.26408

[5] [5]

New Zealand , author =

Fast and. New Zealand , author =. 2024 , keywords =

work page 2024

[6] [6]

Proceedings of the International Conference on Automated Planning and Scheduling , author =

Goal. Proceedings of the International Conference on Automated Planning and Scheduling , author =. 2023 , keywords =. doi:10.1609/icaps.v33i1.27224 , abstract =

work page doi:10.1609/icaps.v33i1.27224 2023

[7] [7]

and Putelli, Luca and Percassi, Francesco and Serina, Ivan , month = oct, year =

Chiari, Mattia and Gerevini, Alfonso E. and Putelli, Luca and Percassi, Francesco and Serina, Ivan , month = oct, year =. Goal. Proceedings of the

work page

[8] [8]

Front Robot AI , author =

Activity,. Front Robot AI , author =. 2021 , pmcid =. doi:10.3389/frobt.2021.643010 , abstract =

work page doi:10.3389/frobt.2021.643010 2021

[9] [9]

Journal of Artificial Intelligence Research , author =

Cost-based goal recognition in navigational domains , volume =. Journal of Artificial Intelligence Research , author =. 2019 , note =. doi:10.1613/jair.1.11343 , abstract =

work page doi:10.1613/jair.1.11343 2019

[10] [10]

Masters, Peta and Sardina, Sebastian , month = may, year =. Cost-. Proceedings of the 16th

work page

[11] [11]

The Fourth Annual Conference on Advances in Cognitive Systems , volume=

Online goal recognition through mirroring: Humans and agents , author=. The Fourth Annual Conference on Advances in Cognitive Systems , volume=

work page

[12] [12]

Proceedings of the AAAI Conference on Artificial Intelligence , author =

Plan. Proceedings of the AAAI Conference on Artificial Intelligence , author =. 2018 , keywords =. doi:10.1609/aaai.v32i1.12097 , abstract =

work page doi:10.1609/aaai.v32i1.12097 2018

[13] [13]

2019 , school=

Goal recognition and deception in path-planning , author=. 2019 , school=

work page 2019

[14] [14]

Advances in Cognitive Systems , author =

Online goal recognition through mirroring\_. Advances in Cognitive Systems , author =. 2016 , keywords =

work page 2016

[15] Vered, Mor; Kaminka, Gal A. Heuristic... Proceedings of the..., August 2017. doi:10.24963/ijcai.2017/621

[16] Goal recognition for rational and irrational agents. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems.

[17] Deceptive Path-Planning. IJCAI.

[18] Extended goal recognition: a planning-based model for strategic deception. Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems.

[19] Toward a Theory of Deception. International Journal of Intelligence and CounterIntelligence.

[20] Toward a general theory of deception. The Journal of Strategic Studies, 1982.

[21] Plan recognition as planning. Proceedings of the 21st International Joint Conference on Artificial Intelligence. Morgan Kaufmann Publishers Inc., 2009.

[22] Probabilistic plan recognition using off-the-shelf classical planners. Proceedings of the AAAI Conference on Artificial Intelligence.

[23] Plan Recognition as Planning Revisited. IJCAI, 2016.

[24] A Survey on Goal Recognition as Planning. International Joint Conference on Artificial Intelligence.

[25] Deceptive... Proceedings of the International Conference on Automated Planning and Scheduling, 2023. doi:10.1609/icaps.v33i1.27240

[26] Deceptive Reinforcement Learning for Privacy-Preserving Planning. Adaptive Agents and Multi-Agent Systems.

[27] Deceptive decision-making under uncertainty. Proceedings of the AAAI Conference on Artificial Intelligence.

[28] Deceptive Path Planning via Reinforcement Learning with Graph Neural Networks. arXiv preprint arXiv:2402.06552.

[29] Deceptive Topographic Path Planning. Proceedings of the 22nd Brazilian Symposium on Games and Digital Entertainment.

[30] Single real goal, magnitude-based deceptive path-planning. Entropy, 2020.

[31] Domain-Independent Deceptive Planning. AAMAS.

[32] Deception in optimal control. 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2018.

[33] Markov decision processes: discrete stochastic dynamic programming. 2014.

[34] Online Bayesian goal inference for boundedly rational planning agents. Advances in Neural Information Processing Systems.

[35] Goal recognition in latent space. International Joint Conference on Neural Networks (IJCNN), 2018.

[36] Path Planning and Information Protection of Mobile Robots Based on Deceptive Reinforcement Learning. International Conference on Mechanism and Machine Science, 2022.

[37] Opponent-aware planning with admissible privacy preserving for UGV security patrol under contested environment. Electronics, 2019.

[38] Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning, 2017.

[39] Adversarial sampling-based motion planning. IEEE Robotics and Automation Letters, 2022.

[40] Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. International Conference on Machine Learning, 2018.

[41] An Analysis of Deceptive Robot Motion. Robotics: Science and Systems.

[42] A dynamic game framework for rational and persistent robot deception with an application to deceptive pursuit-evasion. IEEE Transactions on Automation Science and Engineering, 2021.

[43] Deceptive reinforcement learning under adversarial manipulations on cost signals. International Conference on Decision and Game Theory for Security, 2019.

[44] Optimal deceptive and reference policies for supervisory control. IEEE 58th Conference on Decision and Control (CDC), 2019.

[45] The Eater and the Mover Game. arXiv preprint arXiv:2306.03877.

[46] Modelling deception using theory of mind in multi-agent systems. AI Communications, 2019.

[47] Deceptive planning for resource allocation. American Control Conference (ACC), 2024.

[48] Acting deceptively: Providing robots with the capacity for deception. International Journal of Social Robotics, 2011.

[49] Agent deception via polynomial path planning. Engineering Applications of Artificial Intelligence, 2025.

[50] Minigrid & miniworld: Modular & customizable reinforcement learning environments for goal-oriented tasks. Advances in Neural Information Processing Systems.

[51] An optimization approach to robust goal obfuscation. Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning, 2020.