A Reinforcement Learning Approach to the View Planning Problem

Mustafa Devrim Kaba; Mustafa Gokhan Uzunbas; Ser Nam Lim

arxiv: 1610.06204 · v2 · pith:AF6Q2G6Vnew · submitted 2016-10-19 · 💻 cs.CV

keywords view, algorithm, function, greedy, model, problem, approach, approximation

We present a Reinforcement Learning (RL) solution to the view planning problem (VPP), which asks for a sequence of viewpoints that together sense the entire accessible area of a given object represented as a 3D model. The goal is to minimize the number of viewpoints, which makes the VPP an instance of the set covering optimization problem (SCOP). The SCOP is NP-hard, and known inapproximability results imply that the greedy algorithm achieves essentially the best approximation guarantee attainable in polynomial time. To find solutions better than those of the greedy algorithm, (i) we introduce a novel score function that exploits the geometry of the 3D model, (ii) we model an intuitive human approach to the VPP using this score function, and (iii) we cast the VPP as a Markov Decision Process (MDP) and solve it in an RL framework using well-known RL algorithms. In particular, we use SARSA, Watkins-Q, and TD with function approximation to solve the MDP. We compare our method with the baseline greedy algorithm on an extensive set of test objects and show that it outperforms the baseline in almost all cases.
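To make the set-cover framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy instance in which three candidate views (`A`, `B`, `C`) each see a set of numbered surface patches, and it compares the greedy baseline with a plain tabular Q-learning agent on the corresponding MDP (state: patches covered so far; action: add a view; reward: -1 per view). The paper's geometry-based score function, its function approximation, and its SARSA and TD variants are not reproduced here; all names and hyperparameters below are illustrative assumptions.

```python
import random
from collections import defaultdict

# Hypothetical toy instance: each candidate view sees a set of surface patches.
VIEWS = {"A": {1, 2, 3, 4}, "B": {1, 2, 5}, "C": {3, 4, 6}}
PATCHES = frozenset().union(*VIEWS.values())


def useful_views(covered):
    """Views that would add at least one not-yet-covered patch."""
    return [v for v in VIEWS if VIEWS[v] - covered]


def greedy_cover():
    """Baseline: repeatedly take the view covering the most uncovered patches."""
    covered, chosen = frozenset(), []
    while covered != PATCHES:
        best = max(useful_views(covered), key=lambda v: len(VIEWS[v] - covered))
        chosen.append(best)
        covered |= VIEWS[best]
    return chosen


def q_learning_cover(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning on the cover MDP (assumed setup, reward -1 per view)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, action)], initialised to 0 (optimistic)

    def best_action(state, actions):
        return max(actions, key=lambda a: q[(state, a)])

    for _ in range(episodes):
        state = frozenset()
        while state != PATCHES:
            actions = useful_views(state)
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = best_action(state, actions)  # exploit
            nxt = state | VIEWS[action]
            # Value of the best follow-up action; 0 if nxt is terminal.
            future = max((q[(nxt, a)] for a in useful_views(nxt)), default=0.0)
            # Undiscounted target: -1 for this view plus the best next value.
            q[(state, action)] += alpha * (-1.0 + future - q[(state, action)])
            state = nxt

    # Greedy rollout with respect to the learned action values.
    state, chosen = frozenset(), []
    while state != PATCHES:
        action = best_action(state, useful_views(state))
        chosen.append(action)
        state |= VIEWS[action]
    return chosen


if __name__ == "__main__":
    print("greedy:", greedy_cover())
    print("q-learning:", q_learning_cover())
```

On this toy instance the greedy baseline first takes view `A` because it covers the most patches, and then needs both `B` and `C`, whereas `B` and `C` alone already cover everything. The tabular learner is expected to find such a two-view cover because the per-view penalty makes shorter covers more valuable; a table over states would not scale to real 3D models, which is presumably why the paper relies on function approximation.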

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Learning to Place Guards by Reinforcement: A Geo-Free Neural Policy for the Vertex-Guard Art Gallery Problem
cs.LG 2026-06 unverdicted novelty 7.0

A reinforcement learning policy for the vertex-guard art gallery problem encodes sufficient geometric information in its encoder to allow a simple classifier to achieve high coverage feasibility out of distribution.