Information Pursuit: A Bayesian Framework for Sequential Scene Parsing

Donald Geman; Ehsan Jahangiri; Erdem Yoruk; Laurent Younes; Rene Vidal

arXiv: 1701.02343 · v1 · pith:HZEYO4Q4new · submitted 2017-01-09 · cs.CV · cs.AI · stat.ML

Ehsan Jahangiri, Erdem Yoruk, Rene Vidal, Laurent Younes, Donald Geman

classification cs.CV cs.AI stat.ML

keywords scene, framework, information, model, annotated, answer, bayesian, evidence

Despite enormous progress in object detection and classification, the problem of incorporating expected contextual relationships among object instances into modern recognition systems remains a key challenge. In this work we propose Information Pursuit, a Bayesian framework for scene parsing that combines prior models for the geometry of the scene and the spatial arrangement of object instances with a data model for the output of high-level image classifiers trained to answer specific questions about the scene. In the proposed framework, the scene interpretation is progressively refined as evidence accumulates from the answers to a sequence of questions. At each step, we choose the question to maximize the mutual information between the new answer and the full interpretation given the current evidence obtained from previous inquiries. We also propose a method for learning the parameters of the model from synthesized, annotated scenes obtained by top-down sampling from an easy-to-learn generative scene model. Finally, we introduce a database of annotated indoor scenes of dining room tables, which we use to evaluate the proposed approach.
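
The selection rule in the abstract (ask the question whose answer has maximal mutual information with the interpretation, given the evidence so far) can be illustrated on a toy problem. The sketch below is not the paper's model: it assumes a small finite set of candidate interpretations, binary answers, and answers that are conditionally independent given the interpretation. The names `mutual_information`, `pursue`, `answer_fn`, and the likelihood table `lik` are hypothetical and chosen only for illustration.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable, elementwise."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def mutual_information(posterior, lik_q):
    """I(X_q; Y | evidence) = H(X_q) - E_Y[H(X_q | Y)].

    posterior: current P(Y = y | evidence), shape (n_hypotheses,)
    lik_q:     assumed P(X_q = 1 | Y = y),  shape (n_hypotheses,)
    """
    marginal = float(posterior @ lik_q)  # P(X_q = 1 | evidence)
    return float(binary_entropy(marginal) - posterior @ binary_entropy(lik_q))

def pursue(prior, lik, answer_fn, n_steps):
    """Greedily ask the most informative remaining question, then update.

    lik[q, y] is the assumed probability of a "yes" to question q under
    interpretation y; answer_fn(q) returns the observed answer (0 or 1).
    """
    posterior = np.asarray(prior, dtype=float).copy()
    remaining = list(range(lik.shape[0]))
    asked = []
    for _ in range(min(n_steps, len(remaining))):
        q = max(remaining, key=lambda k: mutual_information(posterior, lik[k]))
        x = answer_fn(q)
        # Bayes update with the likelihood of the observed answer.
        posterior = posterior * (lik[q] if x == 1 else 1.0 - lik[q])
        posterior /= posterior.sum()
        remaining.remove(q)
        asked.append(q)
    return asked, posterior

# Toy usage: 4 interpretations, 3 questions; question 1 is uninformative.
prior = np.full(4, 0.25)
lik = np.array([[0.9, 0.9, 0.1, 0.1],
                [0.5, 0.5, 0.5, 0.5],
                [0.9, 0.1, 0.9, 0.1]])
true_y = 2  # hypothetical ground truth used to simulate noiseless answers
asked, posterior = pursue(prior, lik, lambda q: int(lik[q, true_y] > 0.5), 2)
print(asked, posterior)
```

In this toy setting the uninformative question is never selected within the first two steps, and the posterior concentrates on the simulated ground-truth interpretation. A real scene model would replace the explicit table with sampling-based estimates, since the space of interpretations is far too large to enumerate.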

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Variational Proximal Policy Optimization
stat.ML 2026-06 unverdicted novelty 5.0

VP2O maps PPO to SVGD in a MoE architecture using functional kernels and expert orthogonalization, claiming +179 ELO on Codeforces and 32% token reduction on AIME for a 33B/4B model.