Project Proposal CS365

Semantic Segmentation using Random Forests

and

Web-Supervised Visual Learning

Mentor: Dr. Amitabha Mukerjee

- Rohan Jingar

- Mridul Verma

Semantic Segmentation

Introduction

Object detection and object classification in an image has been one of the most fascinating and highly applicative problems in AI. Many methods have been proposed mostly using the methods of machine learning by learning the properties of objects. This is a supervised learning problem as number of object classes is finite and predetermined. Humans are highly efficient when it comes to recognizing and classifying objects in an image than the current machines. Humans generally dont process much while doing such tasks as they focus only on the area of interest in the scene. This area is of great interest to many researchers working in the field of AI and image processing and we hope that in future we will be able to achieve human like efficiency and accuracy in this problem.

Semantic Segmentation of an Image

We have a predefined semantic categories of objects and we are given an image of a scene. We want to implement a working model which can segment the image into the different categories of objects i.e we want to group together the connected regions of the image which have the same semantic meanings and label them automatically.

Applications

Automatically extracting object images from a scene:
imagine a person who has his own personal collection of images, wouldnt it be great if he can query to get automatically all the images of himself or someone. Initially the person will manually tag other people in images but as the collection grows larger the system can learn and automatically tag people. And for general class of objects for which the system is already learned we can directly extract the images of those objects. This type of automatic tagging has already been deployed on Facebook.

Better Web Image search results:
Highly accurate and efficient object recognition methods can improve the image search results on the web. If we use Google image search some of the result images are not at all related to the search query, still they are included in the result.

Computationally efficient object detection methods have been deployed in some smartphones.

Approach:

We want to combine the Single Histogram Class Model and Random Forests to make a feature descriminator classifier. Random Forests have already been used in various supervised classification problem and are found to be much more efficient in terms of computations while training and classification. They are able to handle a variety of object features. The main use of Random Forests is that different feature types can easily be combined resulting in a more general classifier.

Data Sets used

MSRC Datasets: 21 class labeled database
object classes are: aeroplane, bicycle, bird, boat, body, book, building, car, cat, chair, cow, dog, face, flower, grass, road, sheep, sign, sky, tree, water.

PVOC2007:20 object classes and one background class
msrc database

Web Supervised Visual Learning

Goal

Our main target would be to retrieve specific classes images which would help in better training of the dataset (which would improve the results for image segmentation).

Motivation

As we know that due to the growth in technology the things are becoming more and more sophisticated due to which our main idea of our project which is Image Segmentation which in turn depends upon the classes and the visual cues that we give to the system for training soon becomes obsolete becuse the things are changing rapidly so in order to keep in track of the latest images of the object , we need to update the database again and again. This procedure in turn provides very well and good results for Image Segmentation as we have a very well trained dataset.

Approach

Approach would be as follows

First of all we would pick some significant amount of images from the web related to a particular topic (let's say 1000 images).

Then we wold remove the drawings and abstact images with the help of the image metadata

Now when we would have removed the drawings and abstract images then we would rank the images according to Bayes metadata ranker .

Bayes metadata ranker

This ranker ranks the image according to the various features of the images . The features are decided by the feature vector . Here the feature vector is binary. The feature could contain the features
a={filenam,filedir,imagealt,imagetitle,websitetitle,nearby context(related to the class)}

Now we would train our SVM(support vector machine) on the top ranked images as noisy training data.

Now we would finally re rank the images according to the new trained SVM(improved trained SVM).

References

[1] F. Schroff, A. Criminisi, and A. Zisserman. Object Class Segmentation using Random Forests. BMVC 2008.

[2] F. Schroff, A. Criminisi, and A. Zisserman. Harvesting image databases from the web. In ICCV, 2007.

[3] J. Shotton, M. Johnson, and R. Cipolla, Semantic texton forests for image categorization and segmentation, in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1-8, Anchorage, USA, 2008.

Rohan Jingar and Mridul Verma

IITK | CSE | Project-Home