Failure-Scenario Maker for Rule-Based Agent using Multi-agent Adversarial Reinforcement Learning and its Application to Autonomous Driving

arXiv: 1903.10654 · v3 · pith:47QLRFBY · submitted 2019-03-26 · cs.LG · cs.AI · cs.RO

Akifumi Wachi

keywords: failure, multi-agent, rule-based, adversarial, agent, autonomous, driving, learning

Abstract

We examine the problem of adversarial reinforcement learning for multi-agent domains that include a rule-based agent. Rule-based algorithms are required in safety-critical applications because they must work properly in a wide range of situations. Hence, every effort is made to find failure scenarios during the development phase. However, as the software becomes complicated, finding failure cases becomes difficult. Especially in multi-agent domains, such as autonomous driving environments, it is much harder to find useful failure scenarios that help us improve the algorithm. We propose a method for efficiently finding failure scenarios; this method trains the adversarial agents using multi-agent reinforcement learning such that the tested rule-based agent fails. We demonstrate the effectiveness of our proposed method using a simple environment and an autonomous driving simulator.
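
The core idea in the abstract, training an adversary with reinforcement learning so that a fixed rule-based agent fails, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-dimensional car-following setup, the `ego_rule` function, the speed values, and the reward are all hypothetical, and a single adversary trained with tabular Q-learning stands in for the paper's multi-agent method. The adversary is a lead vehicle that is rewarded only when the rule-based follower collides with it.

```python
import random

ADV_SPEEDS = (0, 1, 2)  # hypothetical adversary actions: lead-vehicle speed per step
START_GAP = 4           # hypothetical initial distance between follower and lead
HORIZON = 6             # maximum number of steps per episode


def ego_rule(gap):
    """Hypothetical rule-based follower: full speed unless the gap is below 2."""
    return 2 if gap >= 2 else 0


def step(gap, adv_speed):
    """Move both vehicles one step; return (new_gap, collided)."""
    new_gap = gap + adv_speed - ego_rule(gap)
    return new_gap, new_gap <= 0


def greedy_action(q, gap):
    """Index of the adversary action with the highest learned value."""
    return max(range(len(ADV_SPEEDS)), key=lambda i: q.get((gap, i), 0.0))


def train_adversary(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning; the adversary earns reward 1 only when the ego fails."""
    rng = random.Random(seed)
    q = {}  # (gap, action index) -> estimated value
    for _ in range(episodes):
        gap = START_GAP
        for _ in range(HORIZON):
            # Epsilon-greedy exploration over adversary speeds.
            if rng.random() < epsilon:
                a = rng.randrange(len(ADV_SPEEDS))
            else:
                a = greedy_action(q, gap)
            new_gap, collided = step(gap, ADV_SPEEDS[a])
            reward = 1.0 if collided else 0.0
            future = 0.0 if collided else q.get((new_gap, greedy_action(q, new_gap)), 0.0)
            old = q.get((gap, a), 0.0)
            q[(gap, a)] = old + alpha * (reward + gamma * future - old)
            if collided:
                break
            gap = new_gap
    return q


def rollout(q):
    """Replay the greedy adversary; return (gaps visited, whether the ego collided)."""
    gap, trace = START_GAP, [START_GAP]
    for _ in range(HORIZON):
        gap, collided = step(gap, ADV_SPEEDS[greedy_action(q, gap)])
        trace.append(gap)
        if collided:
            return trace, True
    return trace, False
```

Calling `rollout(train_adversary())` replays the scenario the adversary has learned. In this toy setting the follower is safe as long as the lead vehicle keeps moving, so the failure appears only when the adversary learns to brake at the right moments, which is the kind of scenario a developer would then use to revise the rule.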

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Steerable Adversarial Scenario Generation through Test-Time Preference Alignment
cs.AI 2025-09 unverdicted novelty 6.0

SAGE reframes adversarial scenario generation as multi-objective preference alignment, using hierarchical group-based optimization and test-time linear interpolation of two expert policies to enable steerable control ...