Semantic Parsing for Priming Object Detection in RGB-D Scenes

Cesar Cadena and Jana Kosecka
3rd Workshop on Semantic Perception, Mapping and Exploration (SPME), Karlsruhe, Germany, 2013
5/5/2013

Motivation

Long-term robotic operation

The semantic information about the surrounding environment is important for high-level robotic tasks.
It is difficult to know a priori all the possible instances or classes of objects that the robot will find in real operation.
Even if we know many of them, it is unreasonable and expensive to run all the specific object detectors at the same time.

Motivation

However:

There are things we can assume to be present (almost) always.
Generic detachable objects also share some characteristics.

Urban: Ground, Buildings, Sky, Objects
Indoors: Ground, Walls, Ceiling, Objects
Today: Ground, Structure, Furniture, Props

Our Problem

Efficiently segment RGB+3D scenes into these general classes, to be used as a prior for specific task detectors.

NYU Depth v2

N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, Indoor segmentation and support inference from RGBD images, in ECCV, 2012.
1449 labeled frames.
26 scene classes.
Labeling spans 894 different classes.

Thanks to N. Silberman for providing the mapping from 894 to 4 classes.

The System

Semantic Segmentation: MAP, Marginals

Different approaches

N. Silberman et al., ECCV 2012
C. Couprie et al., CoRR 2013
X. Ren et al., CVPR 2012
D. Munoz et al., ECCV 2010
I. Endres and D. Hoiem, ECCV 2010

They have at least one of:
Expensive over-segmentation
Expensive features
Expensive inference

Our approach

Semantic Segmentation: MAP, Marginals
Conditional Random Fields: Preprocessing, Graph Structure, Potentials, Inference

Outline

(1) Preprocessing
(2) Graph Structure
(3) Potentials
(4) Inference
(5) Results
(6) Conclusions

Preprocessing: Over-segmentation

SLIC superpixels
R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk, SLIC superpixels compared to state-of-the-art superpixel methods, PAMI, 2012.

Graph Structure

Classical choice on images

Graph Structure: Our choice

Minimum Spanning Tree over 3D
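A minimal sketch of building a minimum spanning tree over 3D points may help make the graph choice concrete. It assumes each superpixel is summarised by one 3D centroid, that every pair of centroids is a candidate edge, and that the edge weight is the Euclidean distance; the talk does not state these details, and the function name `minimum_spanning_tree` is hypothetical.

```python
import math

def minimum_spanning_tree(points):
    """Kruskal's algorithm on the complete graph over `points` (3D tuples).

    Assumption of this sketch: edge weight = Euclidean distance between
    the 3D points. Returns a list of (i, j) index pairs forming the tree.
    """
    n = len(points)
    # All candidate edges, cheapest first.
    candidates = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == n - 1:
                break
    return tree
```

With two groups of points separated by a large depth gap, such a tree links nearby points first and crosses the gap with a single edge, so a depth discontinuity ends up represented by only one connection.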

Potentials: Pairwise CRFs
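For reference, a pairwise CRF over a graph with nodes V (for instance superpixels) and edges E (for instance the edges of a spanning tree) is usually written as below. This is the standard textbook form, not a transcription of the slide; the symbols x (labels), z (observations), φ (unary), ψ (pairwise) and Z (partition function) are notation chosen for this note.

```latex
P(\mathbf{x} \mid \mathbf{z}) =
  \frac{1}{Z(\mathbf{z})}
  \exp\Big(
    -\sum_{i \in V} \phi_i(x_i, \mathbf{z})
    -\sum_{(i,j) \in E} \psi_{ij}(x_i, x_j, \mathbf{z})
  \Big)
```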

Potentials: unary

The slide annotates two terms of the unary potential:
frequency of label j in a k-NN query
frequency of label j in the database
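The two frequencies above can be combined as in the following sketch. It assumes the score for label j is the ratio of the two frequencies, normalised over labels; the slide's exact formula is not preserved, so this is an illustration rather than the authors' definition. It also uses a brute-force neighbour search instead of a kd-tree, and the names `knn_labels` and `unary_scores` are hypothetical.

```python
import math
from collections import Counter

def knn_labels(query, features, labels, k):
    """Labels of the k training features closest to `query` (brute force, Euclidean)."""
    order = sorted(range(len(features)), key=lambda i: math.dist(query, features[i]))
    return [labels[i] for i in order[:k]]

def unary_scores(query, features, labels, k, eps=1e-9):
    """Per-label score, assumed here to be
    (share of label j among the k neighbours) / (share of label j in the database),
    normalised so the scores sum to one."""
    db_counts = Counter(labels)
    nn_counts = Counter(knn_labels(query, features, labels, k))
    scores = {}
    for label, db_n in db_counts.items():
        f_query = nn_counts.get(label, 0) / k   # frequency in the k-NN query
        f_db = db_n / len(labels)               # frequency in the database
        scores[label] = f_query / (f_db + eps)
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}
```

Dividing by the database frequency keeps a label that is simply common in the training data from dominating every query.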

J. Tighe and S. Lazebnik, Superparsing: Scalable nonparametric image parsing with superpixels, ECCV 2010.
The database is a kd-tree of features from training data.

Features (12D)

From image:
mean of Lab color space (3D)
vertical pixel location (1D)
entropy from vanishing points (1D)

From 3D:
height and depth (2D)
mean and std of differences on depth (2D)
local planarity (1D)
neighboring planarity (1D)
vertical orientation (1D)

Potentials: pairwise

Lab color

Inference

We use belief propagation:
Exact results in MAP/marginals
Efficient computation
Thanks to our graph structure choice!
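To illustrate why a tree-structured graph makes inference exact, here is a minimal sum-product belief propagation sketch that returns node marginals. It assumes strictly positive potentials, a connected tree containing node 0, and one symmetric pairwise table shared by all edges; these simplifications and the name `tree_marginals` are assumptions of this sketch, not details from the talk.

```python
def tree_marginals(n_nodes, edges, unary, pairwise):
    """Exact node marginals on a tree via sum-product belief propagation.

    unary[i][a]    : positive score of node i taking label a
    pairwise[a][b] : positive, symmetric compatibility of labels a and b on an edge
    """
    n_labels = len(unary[0])
    neighbours = {i: [] for i in range(n_nodes)}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    # Breadth-first ordering from node 0, so parents precede their children.
    parent = {0: None}
    order = [0]
    for node in order:
        for nb in neighbours[node]:
            if nb not in parent:
                parent[nb] = node
                order.append(nb)

    def send(src, dst, msgs):
        # Unary of src times every incoming message except the one from dst.
        belief = list(unary[src])
        for nb in neighbours[src]:
            if nb != dst:
                belief = [b * m for b, m in zip(belief, msgs[(nb, src)])]
        out = [sum(belief[a] * pairwise[a][b] for a in range(n_labels))
               for b in range(n_labels)]
        norm = sum(out)
        return [o / norm for o in out]  # normalise for numerical stability

    msgs = {}
    for node in reversed(order):  # pass 1: leaves towards the root
        if parent[node] is not None:
            msgs[(node, parent[node])] = send(node, parent[node], msgs)
    for node in order:            # pass 2: root towards the leaves
        if parent[node] is not None:
            msgs[(parent[node], node)] = send(parent[node], node, msgs)

    marginals = []
    for i in range(n_nodes):
        belief = list(unary[i])
        for nb in neighbours[i]:
            belief = [b * m for b, m in zip(belief, msgs[(nb, i)])]
        norm = sum(belief)
        marginals.append([b / norm for b in belief])
    return marginals
```

Each edge carries exactly two messages, so the cost grows linearly with the number of nodes and quadratically with the number of labels. Replacing the sum in `send` with a max (and keeping back-pointers) gives the max-product variant used to obtain a MAP labeling.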

Results: NYU-D v2 Dataset

[Figure: ground truth (GT) and MAP labelings]

Results: NYU-D v2 Dataset

[Confusion matrix and comparisons]

Results: NYU-D v2 Dataset

Some failures:
[Figure: GT and MAP labelings]

Marginal probabilities

P(Ground), P(Structure), P(Furniture), P(Props)

Provide very useful information for specific tasks, e.g.:
Specific object detection
Support inference

Conclusions

We have presented a computationally efficient approach for semantic segmentation of priming objects indoors.

Our approach effectively uses 3D and image cues.
Depth discontinuities are evidence for occlusions.
The MST over 3D keeps intra-class components coherently connected.

Discussion

Silberman et al. 2012. Features: bunch of engineered features (>1000D). Local classifier: logistic regression. Graph structure: dense connections over the image.
Couprie et al. 2013. Features: learned features (>1000D). Local classifier: neural networks. Graph structure: none.
Ours. Features: selected meaningful features (12D). Local classifier: k-NN. Graph structure: MST over 3D.

Thanks!!

Cesar Cadena [email protected]
Jana Kosecka [email protected]
Funded by the US Army Research Office Grant W911NF-1110476.

Working on:

People detection, by Shenghui Zhou
Multi-view and video
