
arxiv: 1903.07752 · v1 · pith:TVDSBR2Anew · submitted 2019-03-18 · 🧮 math.OC

Incentive Design for Temporal Logic Objectives

classification 🧮 math.OC
keywords agent, principal, incentives, objective, optimal, behavior, decision, incentive

We study the problem of designing an optimal sequence of incentives that a principal should offer to an agent so that the agent's optimal behavior under the incentives realizes the principal's objective expressed as a temporal logic formula. We consider an agent with a finite decision horizon and model its decision-making process as a Markov decision process (MDP). Under certain assumptions, we present a polynomial-time algorithm to synthesize an incentive sequence that minimizes the cost to the principal. We show that if the underlying MDP has only deterministic transitions, the principal can hide its objective from the agent and still realize the desired behavior through incentives. On the other hand, an MDP with stochastic transitions may require the principal to share its objective with the agent. Finally, we demonstrate the proposed method in motion planning examples where a principal changes the optimal trajectory of an agent by providing incentives.
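To make the setting concrete, the following is a minimal sketch of how a time-indexed incentive can change a finite-horizon agent's optimal action in an MDP with deterministic transitions. It is not the paper's synthesis algorithm: the three-state model, the reward values, and the names `TRANSITIONS`, `REWARDS`, and `agent_policy` are all hypothetical, and the agent's behavior is computed by ordinary backward induction.

```python
# Hypothetical deterministic MDP: (state, action) -> next state.
TRANSITIONS = {
    ("s0", "left"): "s1",
    ("s0", "right"): "s2",
    ("s1", "stay"): "s1",
    ("s2", "stay"): "s2",
}

# Hypothetical rewards the agent collects on its own, without incentives.
REWARDS = {
    ("s0", "left"): 1.0,
    ("s0", "right"): 0.0,
    ("s1", "stay"): 1.0,
    ("s2", "stay"): 0.0,
}


def agent_policy(horizon, incentives=None):
    """Return the agent's optimal policy as a list: policy[t][state] -> action.

    `incentives` maps (t, state, action) to an extra payment offered by the
    principal at time step t. Missing entries mean no payment.
    """
    incentives = incentives or {}
    states = sorted({s for s, _ in TRANSITIONS})
    value = {s: 0.0 for s in states}  # value after the final step
    policy = [dict() for _ in range(horizon)]

    # Backward induction over the finite decision horizon.
    for t in reversed(range(horizon)):
        new_value = {}
        for s in states:
            # Total return of each available action: own reward, plus the
            # incentive offered at time t, plus the value of the next state.
            q = {
                a: REWARDS[(src, a)] + incentives.get((t, src, a), 0.0) + value[nxt]
                for (src, a), nxt in TRANSITIONS.items()
                if src == s
            }
            best = max(q, key=q.get)
            policy[t][s] = best
            new_value[s] = q[best]
        value = new_value
    return policy


baseline = agent_policy(horizon=2)
incentivized = agent_policy(horizon=2, incentives={(0, "s0", "right"): 2.5})
```

In this toy model the agent earns a total of 2.0 by going left and 0.0 by going right over two steps, so a payment on `("s0", "right")` at the first step changes the agent's first action only if it exceeds 2.0. The assumed payment of 2.5 is sufficient; a cost-minimizing principal would look for the smallest payment that still works.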
