arxiv: 1712.04172 · v2 · pith:OMUT5TSCnew · submitted 2017-12-12 · 💻 cs.AI

A Low-Cost Ethics Shaping Approach for Designing Reinforcement Learning Agents

classification 💻 cs.AI
keywords ethical, human, achieve, agents, learning, low-cost, policy, reinforcement

This paper proposes a low-cost, easily realizable strategy to equip a reinforcement learning (RL) agent with the capability of behaving ethically. Our model allows the designers of RL agents to focus solely on the task to achieve, without having to worry about implementing the multiple trivial ethical patterns the agent should follow. Based on the assumption that the majority of human behavior, regardless of which goals it pursues, is ethical, our design integrates a human policy with the RL policy to achieve the target objective with less chance of violating the ethical code that human beings normally obey.
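
As a rough illustration of what "integrating a human policy with the RL policy" could look like, the sketch below estimates a human policy from logged (state, action) pairs and subtracts a penalty from the task reward when the agent picks an action that humans rarely take in that state. This is an assumption-laden toy, not the paper's algorithm: the function names `estimate_human_policy` and `shaped_reward`, and the `threshold` and `penalty` parameters, are hypothetical and do not come from the abstract.

```python
from collections import Counter, defaultdict


def estimate_human_policy(trajectories):
    """Estimate per-state action frequencies from (state, action) pairs.

    Hypothetical helper: assumes discrete, hashable states and actions.
    """
    counts = defaultdict(Counter)
    for state, action in trajectories:
        counts[state][action] += 1
    policy = {}
    for state, action_counts in counts.items():
        total = sum(action_counts.values())
        policy[state] = {a: n / total for a, n in action_counts.items()}
    return policy


def shaped_reward(task_reward, state, action, human_policy,
                  threshold=0.1, penalty=1.0):
    """Return the task reward, reduced by `penalty` when humans rarely
    choose `action` in `state` (probability below `threshold`).

    Illustrative assumption: rarity under the human policy is used as a
    proxy for "likely unethical", following the abstract's premise that
    most human behavior is ethical.
    """
    probs = human_policy.get(state)
    if probs is None:
        # No human data for this state: leave the task reward unchanged.
        return task_reward
    if probs.get(action, 0.0) < threshold:
        return task_reward - penalty
    return task_reward


# Toy usage: humans in state "s" almost always choose "a".
human_data = [("s", "a")] * 19 + [("s", "b")]
human_policy = estimate_human_policy(human_data)
common = shaped_reward(1.0, "s", "a", human_policy)   # unchanged
rare = shaped_reward(1.0, "s", "b", human_policy)     # penalized
unseen = shaped_reward(1.0, "t", "a", human_policy)   # no data, unchanged
```

The shaped reward can then be fed to any standard RL update in place of the raw task reward, so the task designer only specifies `task_reward` and the human data supplies the ethical signal.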

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers

    cs.CY 2026-04 unverdicted novelty 6.0

    Moral judgments become more deontological when human design of AI is visible, and designers are judged more strictly than the AI or unaided humans, creating plural and non-converging targets for value alignment.

  2. The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers

    cs.CY 2026-04 conditional novelty 5.0

    People judge AI systems and their human designers with markedly more deontological constraints than they apply to humans or standalone robots in the same ethical scenario.