Morel-Forster, Andreas. Generative shape and image analysis by combining Gaussian processes and MCMC sampling. 2016, Doctoral Thesis, University of Basel, Faculty of Science.
PDF (52Mb), available under License CC BY-NC-ND (Attribution-NonCommercial-NoDerivatives).
Official URL: http://edoc.unibas.ch/diss/DissB_12237
Abstract
Fully automatic analysis of faces is important for automatic access control, human-computer interaction, and the automatic evaluation of surveillance videos. For humans it is easy to look at and interpret faces: assigning attributes, moods or even intentions to the depicted person seems to happen without any difficulty. In contrast, computers struggle even with simple questions and still fail to answer more demanding ones such as: "Are these two persons looking at each other?"
The interpretation of an image depicting a face is facilitated by using a generative model for faces. Modeling the variability between persons, illumination, view angle and occlusions leads to a rich abstract representation. The model state encodes comprehensive information, reducing the effort needed to solve a wide variety of tasks. However, to use a generative model, the model first needs to be built and then adapted to a particular image. There exist many highly tuned algorithms for either of these steps, most of which require some degree of user input. These algorithms often lack robustness, full automation, or wide applicability to different objects or data modalities.
Our main contribution in this PhD thesis is the presentation of a general, probabilistic framework to build and adapt generative models. Using the framework, we exploit information probabilistically in the domain it originates from, independent of the problem domain. The framework combines Gaussian processes and data-driven MCMC sampling. The generative models are built using the Gaussian process formulation. To adapt a model we use the Metropolis-Hastings algorithm based on a propose-and-verify strategy. The framework consists of well-separated parts: model building is separated from adaptation, and adaptation is further separated into update proposals and a verification layer. This allows individual parts to be adapted, exchanged, removed or integrated without changes to the other parts.
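The propose-and-verify strategy described above can be illustrated with a minimal Metropolis-Hastings loop. This is a hedged sketch on a toy 1-D Gaussian target, not the thesis implementation: the `log_target` and `propose` functions are stand-ins for the thesis's image likelihood and data-driven update proposals, which would slot into the same structure.

```python
import math
import random

def log_target(theta):
    # Toy unnormalized log-posterior (standard normal); in the framework
    # this role is played by the verification layer's likelihood model.
    return -0.5 * theta * theta

def propose(theta, step=0.5):
    # Simple random-walk update proposal; richer, partly data-driven
    # proposals would replace this function without touching the loop.
    return theta + random.gauss(0.0, step)

def metropolis_hastings(n_steps=10000, seed=0):
    random.seed(seed)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        candidate = propose(theta)                      # propose
        log_alpha = log_target(candidate) - log_target(theta)
        if math.log(random.random()) < log_alpha:       # verify (accept/reject)
            theta = candidate
        samples.append(theta)
    return samples

samples = metropolis_hastings()
mean = sum(samples) / len(samples)
```

The separation the abstract describes is visible even in this sketch: the proposal and the verification step only interact through the accept/reject decision, so either can be swapped independently.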
The framework is presented in the context of facial data analysis. We introduce a new kernel exploiting the symmetry of faces and augment a learned generative model with additional flexibility. We show how a generative model is rigidly aligned, non-rigidly registered, or adapted to 2D images with the same basic algorithm. We exploit information from 2D images to constrain 3D registration. We integrate directed proposals into the sampling, shifting the algorithm towards stochastic optimization. We show how to handle missing data by adapting the likelihood model used. We integrate a discriminative appearance model into the image likelihood model to handle occlusions. We demonstrate the wide applicability of our framework by also solving medical image analysis problems, reusing the parts introduced for faces.
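One standard way to build a kernel that encodes mirror symmetry, sketched below, is to add the base kernel to a copy evaluated on reflected inputs; this is an illustrative construction under that assumption, not the thesis code. The `rbf`, `mirror`, and `symmetric_kernel` names are hypothetical.

```python
import numpy as np

def rbf(x, y, sigma=1.0):
    # Squared-exponential base kernel.
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

def mirror(x):
    # Reflect a 3D point across the x = 0 plane (the assumed symmetry plane).
    m = x.copy()
    m[0] = -m[0]
    return m

def symmetric_kernel(x, y):
    # Summing the base kernel with its mirrored counterpart keeps the kernel
    # symmetric and positive semi-definite (it corresponds to the feature map
    # x -> phi(x) + phi(mirror(x)), up to a constant factor), while making
    # a point and its mirror image co-vary.
    return rbf(x, y) + rbf(mirror(x), y)
```

A Gaussian process built from such a kernel generates deformations in which the left and right halves of a shape are correlated, which is the kind of prior knowledge a symmetry kernel for faces would express.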
Advisors: Vetter, Thomas and Roth, Volker
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Ehemalige Einheiten Mathematik & Informatik > Computergraphik Bilderkennung (Vetter)
UniBasel Contributors: Vetter, Thomas and Roth, Volker
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 12237
Thesis status: Complete
Number of Pages: 1 online resource (II, 139 pages)
Language: English
Last Modified: 02 Aug 2021 15:14
Deposited On: 02 Oct 2017 12:36