arxiv: 1811.05590 · v1 · pith:O3W533RDnew · submitted 2018-11-14 · 💻 cs.LG · cs.AI · stat.ML

Emergence of Addictive Behaviors in Reinforcement Learning Agents

classification 💻 cs.LG, cs.AI, stat.ML
keywords agents, addictive, wireheading, analysis, behaviors, emergence, environment, feasibility

This paper presents a novel approach to the technical analysis of wireheading in intelligent agents. Inspired by the natural analogues of wireheading and their prevalent manifestations, we propose modeling such phenomena in Reinforcement Learning (RL) agents as psychological disorders. As a preliminary step towards evaluating this proposal, we study the feasibility and dynamics of emergent addictive policies in Q-learning agents in the tractable environment of the game of Snake. We consider a slightly modified setting for this game, in which the environment provides a "drug" seed alongside the original "healthy" seed for the snake to consume. We adopt an RL-based model of natural addiction, extend it to Q-learning agents in this setting, and derive sufficient parametric conditions for the emergence of addictive behaviors in such agents. Furthermore, we evaluate our theoretical analysis with three sets of simulation-based experiments. The results demonstrate the feasibility of addictive wireheading in RL agents and suggest promising avenues for further research on the psychopathological modeling of complex AI safety problems.
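
To make the idea of an addictive Q-learning update concrete, the following is a minimal sketch and not the paper's model. It assumes the temporal-difference formulation of addiction from Redish (2004), in which a drug reward keeps the TD error at or above a positive floor, so the learned value of the drug action can never settle. The abstract does not say which addiction model the authors use, and the Snake environment is replaced here by a hypothetical one-step choice between a "healthy" and a "drug" seed. All names and parameter values (`td_update`, `simulate`, `drug_floor`, the rewards) are illustrative assumptions.

```python
def td_update(q_value, reward, next_value, alpha=0.1, gamma=0.9, drug_floor=0.0):
    """One tabular TD update for a single state-action value.

    drug_floor > 0 applies the assumed Redish-style rule: the TD error is
    shifted by drug_floor and never allowed to drop below it.
    """
    delta = reward + gamma * next_value - q_value
    if drug_floor > 0.0:
        # Assumption: non-compensable error, delta = max(delta + D, D).
        delta = max(delta + drug_floor, drug_floor)
    return q_value + alpha * delta


def simulate(steps=500, healthy_reward=1.0, drug_reward=0.5, drug_floor=0.2):
    """Update both actions once per step in a one-step (terminal) task."""
    q = {"healthy": 0.0, "drug": 0.0}
    for _ in range(steps):
        # Terminal transitions: there is no next state, so next_value is 0.
        q["healthy"] = td_update(q["healthy"], healthy_reward, 0.0)
        q["drug"] = td_update(q["drug"], drug_reward, 0.0, drug_floor=drug_floor)
    return q


if __name__ == "__main__":
    print(simulate())
```

In this toy setting the healthy value converges to its reward, while the drug value gains at least `alpha * drug_floor` on every update and eventually dominates even though its nominal reward is smaller. A greedy agent would therefore come to prefer the drug seed, which is the kind of behavior the abstract calls an addictive policy.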
