Visual Recognition

Instructor: Vinay P. Namboodiri

Lecture hours:
Tuesday 17:10 - 18:25 and Thursday 17:10 - 18:25
Venue: L19, LHC

Course Content

In this course we undertake a study of visual recognition from various aspects related to computer vision. In visual recognition, the aim is to interpret semantic information from images. This is a task that humans excel at, and the aim is to be able to do so computationally. The kind of semantic information one seeks to obtain from images relates to the kind of entities present in an image (for instance, naming the kind of bird or type of vehicle present in an image). More generally, the challenge can be thought of mathematically as one of learning a function F_w(x) -> y which takes a visual input x and generates a target output y by using a parameter vector w. Visual recognition is challenging due to the wide variety in the space of inputs x and the kinds of outputs y. For instance, even a restricted class of images such as faces exhibits considerable variation due to factors such as pose, illumination, occlusion and orientation, in addition to the inherent variety of human faces, and therefore the task of recognizing faces is challenging. In this course we will consider a variety of output tasks such as object recognition, object detection and object segmentation.
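
As a rough illustration of the mapping from a visual input x to an output y through parameters w, the following minimal sketch uses a linear classifier over a flattened toy "image". The function name `predict`, the 4-pixel input and the hand-picked weights are all assumptions made for illustration; they are not part of the course material, and in practice w would be learned from data rather than set by hand.

```python
from typing import Sequence


def predict(w: Sequence[Sequence[float]], x: Sequence[float]) -> int:
    """Hypothetical F_w(x): return the index of the class whose weight row scores x highest."""
    # One score per class: the dot product of that class's weights with the input.
    scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    # The predicted label y is the index of the largest score.
    return max(range(len(scores)), key=scores.__getitem__)


# Toy setup (assumed): a 4-pixel "image" and two classes.
w = [
    [1.0, 1.0, 0.0, 0.0],  # class 0 responds to bright left pixels
    [0.0, 0.0, 1.0, 1.0],  # class 1 responds to bright right pixels
]
x = [0.9, 0.8, 0.1, 0.2]  # an input that is bright on the left

y = predict(w, x)
print(y)  # prints 0
```

Real recognition models replace the single dot product with many learned layers, but the interface is the same: an input x, parameters w, and an output y.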

Current techniques based on deep learning are able to learn the above tasks, but they assume full supervision. Obtaining such supervision for each task is challenging and not always feasible. Recently there has been interesting work towards solving these problems while reducing the amount of supervision required. This is done in various ways, such as transfer learning, active learning, learning with weak supervision and unsupervised learning. In the course, we also aim to consider such techniques as they apply to the various visual recognition tasks.

A brief outline of the topics to be covered in the course is as follows:

Introduction to visual recognition and the various problems
Instance Recognition
Features for visual recognition
Object Classification
Classical to Deep learning
Object Detection
Object Segmentation
Self Supervision
Weak Supervision
Domain Adaptation
Unsupervised visual recognition
Vision and Language

List of Teaching Assistants

Aman Deep Singh
Pravendra Singh
Saket Jhunjhunwala
Samik Some
Siddharth Singla
Utsav Singh

References

Computer Vision: Algorithms and Applications by Richard Szeliski (available online)
Computer Vision: Models, Learning, and Inference by Simon J.D. Prince (available online)
Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville (available online)
Computer Vision: A Modern Approach by Forsyth and Ponce (Indian edition available)

Course Discussion - Piazza

Link available over here

Assignment

Lecture Slides, notes and related reading

Lecture 1: Introduction
Lecture 2: Instance Recognition
Lecture 3: Local features
Lecture 4: SIFT
Lecture 5: Object Categorization
Lecture 6: Representations for Object Categorization 1
Lecture 7: Image Processing part 1 and part 2
Lecture 8: Representations for Object Categorization 2
Lecture 9: Neural Networks
Lecture 10: Convolutional Neural Networks
Lecture 11: Convolutional Neural Networks 2
Lecture 12: Object Detection: HoG
Lecture 13: Region based CNN for Object Detection: RCNN
Lecture 14: Deep Object Detection 1
Lecture 15: Deep Object Detection 2
Lecture 16: Segmentation
Lecture 17: FCN and UNet Segmentation
Lecture 18: Unsupervised learning by Context Prediction
Lecture 19: Self Supervised Learning
Lecture 20: Domain Adaptation
Lecture 21: Generative Adversarial Networks
Lecture 22: Understanding Motion and Actions
Lecture 23: 3D and Depth Estimation
Additional reading material: Excerpt from the Trucco and Verri book, available over here
Lecture 24: Recurrent Neural Networks
Lecture 25: Recurrent Neural Networks and other topics