My main research area is Machine Learning. Specifically, I am motivated by the challenges of learning efficient representations of data, both to reduce computational cost and to discover the underlying intrinsic characteristics of a problem. We formulate and address these challenges using probabilistic frameworks.

Several application fields pose interesting challenges for Machine Learning; I am particularly interested in Robotics, Computer Graphics and Computer Vision. What makes these fields interesting from a learning and representation perspective is that each is characterised by data observed and represented in very high-dimensional spaces. This is true of many fields, but what sets these apart is that the high-dimensional data can often be interpreted naturally by a human observer. As an example, human pose is usually represented using up to about 100 joint angles. Interpreting the raw angles is very hard, but once they are mapped onto a skeleton and displayed, a human observer can easily interpret the data and draw conclusions from it. This is advantageous because it lets us easily evaluate models of such data.

Bayesian Networks
Bayesian Networks are directed graphical models that use conditional independencies in the data to factorise the joint distribution over an observation space into a more compact representation. The benefits are many. First, the factorisation is itself informative: it shows which variables depend on each other and how. Further, the factorisation has the potential to yield a much more efficient and compact model, making it possible to learn from significantly less data. However, finding this structure poses a significant challenge, as the space of possible structures grows super-exponentially with the number of observed variables. We are interested in reducing this complexity by learning an efficient parametrisation of the observed data that exploits correlations in the data. We are also interested in formulating principled methods for learning structures, which includes formulating structural priors and optimisation techniques for learning.
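
To give a feel for how quickly the structure space grows, the sketch below counts the labelled directed acyclic graphs on n nodes using Robinson's recurrence. This is a standard combinatorial result rather than part of our own method, and the function name count_dags is chosen here purely for illustration.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labelled DAGs on n nodes (Robinson's recurrence).

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Each value is the number of candidate network structures
    # over n observed variables.
    for n in range(1, 11):
        print(n, count_dags(n))
```

Four variables already admit 543 structures and five admit 29281, which is why exhaustive search over structures is only feasible for very small problems.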



Applications of Bayesian Networks
We are currently using Bayesian Networks to model the observation space of a human-robot imitation learning scenario for the specific task of grasping. Such a task is characterised by a very high-dimensional observation space of variables with complex distributions. Using a Bayesian Network, we are trying to uncover the structure of the relationships between the variables. Our motivation is that such a model should be able both to discover the non-obvious constraints of a specific task and to transfer information between the human and the robot.

Semantic Video Classification


Models for Facial Motion
Facial motion is a significant carrier of information in human interaction. Building models of facial motion is therefore of key importance in a wide range of fields, such as Human-Robot Interaction and Computer Graphics, but also Experimental Psychology. A good model should allow us to extract information from motion, generate plausible, natural-looking motion, and study how motion correlates with underlying factors such as gender and mood. Potential applications range from generating realistic avatars for computer games to controlling robotic faces for medical purposes. We are collaborating with the University of Bristol on such models.


Shared Gaussian Process Latent Variable Models (SGP-LVM):
Many modelling scenarios are characterised by several observation streams of a single underlying phenomenon. These can be different views of the same problem or measurements from different modalities. We have extended the Gaussian Process Latent Variable Model (GP-LVM) to model this type of data. Specifically, we have developed models for two scenarios: one where the variance in the observed data streams is completely shared, which results in a purely shared model, and a more general one where, in addition to the shared information, each observation space also contains information that is private to its domain.
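
The generative assumption behind the shared-plus-private scenario can be illustrated with a small NumPy sketch. It only samples data in the generative direction: two observation spaces Y and Z are drawn from Gaussian processes whose inputs are a common shared latent space concatenated with a private latent space for each domain. The RBF kernel, the dimensions, and all function and variable names are assumptions made for illustration; learning the latent space from data, which is the actual modelling problem, is not shown.

```python
import numpy as np


def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale ** 2)


def sample_gp_outputs(X, n_outputs, rng, jitter=1e-6):
    """Draw n_outputs independent GP functions evaluated at the rows of X."""
    K = rbf_kernel(X) + jitter * np.eye(X.shape[0])
    L = np.linalg.cholesky(K)  # K = L @ L.T
    return L @ rng.standard_normal((X.shape[0], n_outputs))


rng = np.random.default_rng(0)
N = 50  # number of paired observations (illustrative)

X_shared = rng.standard_normal((N, 2))  # latent factors common to both domains
X_priv_y = rng.standard_normal((N, 1))  # latent factors seen only by Y
X_priv_z = rng.standard_normal((N, 1))  # latent factors seen only by Z

# Each observation space depends on the shared part and its own private part.
Y = sample_gp_outputs(np.hstack([X_shared, X_priv_y]), n_outputs=10, rng=rng)
Z = sample_gp_outputs(np.hstack([X_shared, X_priv_z]), n_outputs=6, rng=rng)

print(Y.shape, Z.shape)
```

Dropping X_priv_y and X_priv_z from the inputs gives the purely shared case, where all variance in both streams is explained by the common latent space.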

code and examples



Non Consolidating Component Analysis (NCCA):
One of the main drawbacks of the SGP-LVM, and of GP-LVM models in general, is the reliance on a good initialisation of the latent space. In the general case, this means that a GP-LVM relies on the existence of an analogous model for initialisation. For the standard model several such models exist; most commonly a spectral dimensionality reduction scheme (a variation of MDS) is applied. For the shared modelling scenario, however, no such model is readily available. Canonical Correlation Analysis (CCA) is a classic algorithm that extracts, from several data streams, directions and an embedding that maximise the correlation between the domains. We have developed an algorithm, referred to as Non Consolidating Component Analysis (NCCA), which extends classical CCA and kernel CCA to also model the private information contained in each domain, thereby making it possible to view CCA as a dimensionality reduction scheme rather than a feature selection algorithm.
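
For reference, classical linear CCA, the starting point that NCCA extends, can be sketched in a few lines of NumPy by whitening each domain and taking an SVD of the whitened cross-covariance. This is the textbook algorithm and not NCCA itself; the function name cca, the small regulariser reg, and the toy data are assumptions made for illustration.

```python
import numpy as np


def cca(Y, Z, n_components, reg=1e-8):
    """Classical linear CCA via whitening and SVD.

    Returns projection matrices Wy, Wz and the canonical correlations.
    """
    Y = Y - Y.mean(axis=0)
    Z = Z - Z.mean(axis=0)
    n = Y.shape[0]
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Czz = Z.T @ Z / (n - 1) + reg * np.eye(Z.shape[1])
    Cyz = Y.T @ Z / (n - 1)

    Ly = np.linalg.cholesky(Cyy)  # Cyy = Ly @ Ly.T
    Lz = np.linalg.cholesky(Czz)  # Czz = Lz @ Lz.T

    # Whitened cross-covariance M = Ly^{-1} Cyz Lz^{-T}
    M = np.linalg.solve(Ly, Cyz)
    M = np.linalg.solve(Lz, M.T).T

    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to the original coordinates.
    Wy = np.linalg.solve(Ly.T, U[:, :n_components])
    Wz = np.linalg.solve(Lz.T, Vt.T[:, :n_components])
    return Wy, Wz, s[:n_components]


# Toy data: one shared signal plus private dimensions in each domain.
rng = np.random.default_rng(0)
N = 500
shared = rng.standard_normal((N, 1))
Y = np.hstack([shared + 0.1 * rng.standard_normal((N, 1)),
               rng.standard_normal((N, 2))])  # two private dimensions
Z = np.hstack([rng.standard_normal((N, 1)),   # one private dimension
               shared + 0.1 * rng.standard_normal((N, 1))])

Wy, Wz, rho = cca(Y, Z, n_components=2)
print("canonical correlations:", rho)
```

The first canonical pair recovers the shared signal, while the private dimensions of Y and Z are discarded by construction. Retaining that private information alongside the shared embedding is the gap that NCCA is intended to address.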

code



Factorized Orthogonal Latent Space Models (FOLS):
In the original work on the SGP-LVM model we used a combination of CCA and NCCA to initialise the latent locations of the model. CCA finds directions of maximal correlation between data sets. However, because correlation as a measure does not take into account the amount of variance each direction explains, this can lead to undesirable representations of the data, specifically in scenarios where each domain has been corrupted by correlated noise. As learning a GP-LVM is a non-convex problem, a poor initialisation of this kind can lead to significant errors in the learned representation of the data.

To tackle this issue we proposed the Factorized Orthogonal Latent Space (FOLS) model, which introduces additional constraints into the SGP-LVM framework to find a shared latent representation of the data factorised into orthogonal subspaces. Further, by building on previous work on continuous dimensionality reduction, we also remove the need for a separate initialisation step by initialising with the observed data itself.
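
The weakness of correlation as a selection criterion, which motivates the FOLS model, can be seen in a toy example. Below, two domains share a strong signal that each observes with independent noise, and they are also both corrupted by the same very small noise term. The noise direction explains almost no variance, yet it is the more strongly correlated of the two. The data, noise levels and helper names are assumptions chosen for illustration only.

```python
import math
import random


def variance(a):
    """Population variance of a sequence."""
    m = sum(a) / len(a)
    return sum((x - m) ** 2 for x in a) / len(a)


def pearson(a, b):
    """Pearson correlation coefficient between two sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


random.seed(0)
n = 2000

# A strong shared signal, observed with independent noise in each domain.
signal = [random.gauss(0.0, 1.0) for _ in range(n)]
y_signal = [s + random.gauss(0.0, 0.5) for s in signal]
z_signal = [s + random.gauss(0.0, 0.5) for s in signal]

# A tiny noise term that corrupts both domains identically.
shared_noise = [random.gauss(0.0, 0.01) for _ in range(n)]
y_noise = list(shared_noise)
z_noise = list(shared_noise)

print(f"signal: corr={pearson(y_signal, z_signal):.3f}, "
      f"var={variance(y_signal):.4f}")
print(f"noise:  corr={pearson(y_noise, z_noise):.3f}, "
      f"var={variance(y_noise):.6f}")
```

A method that ranks directions purely by correlation would prefer the noise direction here, even though it carries a negligible share of the variance in either domain.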

code and examples



Applications of Shared GP-LVM Models:
The models we have developed have been applied to several different types of data. Specifically, we have applied them to the computer vision task of image-based human pose estimation and to ambiguity modelling for human-robot facial mapping. We are currently applying the models to multi-modal feature fusion for the task of geolocation.
