Title: Weakly Supervised Dynamic Models for Facial Analysis in Videos

Abstract: Previous successful approaches for facial expression classification in videos have relied primarily on global pooling. Such methods often assume the presence of a single uniform event spanning the sequence and discard local temporal information. They are therefore suboptimal for learning prediction models in (i) the weakly supervised setting and (ii) unsegmented videos containing multiple events. We tackle these challenges by explicitly modeling both weak labels and the dynamic nature of facial expressions. I will first briefly discuss our work on weakly supervised learning for pain classification in videos [2, 3]. This framework combines a multiple-segment representation of videos with the Multiple Instance Learning (MIL) framework to both classify and localize the target expression in a video. I will also discuss qualitative and quantitative results highlighting the advantages of this method. Despite its advantages, MIL identifies only a single discriminative event and therefore fails to capture any dynamical patterns in facial expressions. In our second work, we propose a generalization of MIL referred to as the Latent Ordinal Model (LOMo) [1]. This approach is based on a novel latent structured SVM (LSVM) formulation that jointly learns the sub-events and a prior on their ordering. LOMo differs from previous LSVM models in using a 'loosely structured' formulation, and we propose an effective SGD-based hinge loss minimization objective to solve it. We also show that LOMo achieves consistent improvements over relevant competitive baselines on four challenging facial analysis tasks. In combination with complementary features, our method reports state-of-the-art results on these datasets. I will close the talk with a discussion of our current work on extending LOMo to unconstrained human action recognition in videos. (Short illustrative sketches of both models are included after the bio.)

References:
[1] Sikka, K., Sharma, G., and Bartlett, M. (2016). LOMo: Latent Ordinal Model for Facial Analysis in Videos. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Sikka, K., Dhall, A., and Bartlett, M. (2014). Weakly Supervised Pain Localization and Classification with Multiple Segment Learning. The Best of Face and Gesture 2013, Image and Vision Computing (IVC).
[3] Sikka, K., Dhall, A., and Bartlett, M. (2013). Weakly Supervised Pain Localization using Multiple Instance Learning. IEEE International Conference on Automatic Face and Gesture Recognition (AFGR).

Bio: Karan Sikka completed his bachelor's degree at the Indian Institute of Technology Guwahati in 2010 and is currently a final-year PhD student at the University of California San Diego, working with Dr. Marian Bartlett. His research focuses on building robust machine learning models for classifying facial expressions in videos, in particular on using weakly supervised learning approaches to tackle the underlying challenges in recognizing natural expressions. His work on using Multiple Instance Learning for pain classification received a Best Student Paper Honorable Mention Award. His team also placed second in the first Facial Expression in the Wild Challenge in 2013 and was awarded the Best Paper Award. His long-term research plan is to use effective and richer machine learning models to understand human behavior and to reveal interesting aspects of it using these models.
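
A minimal sketch of the multiple-segment MIL idea behind [2, 3], not the authors' exact formulation: a video is treated as a bag of temporal segments, the bag score is the maximum over per-segment scores, and the argmax localizes the discriminative segment. The linear scoring function, feature dimensions, and all names here are illustrative assumptions.

```python
import numpy as np

def segment_scores(w, b, segments):
    """Linear score for each temporal segment; rows of `segments` are
    per-segment feature vectors (e.g., pooled frame descriptors)."""
    return segments @ w + b

def mil_predict(w, b, video_segments):
    """MIL bag prediction: the video is labeled by its highest scoring
    segment, and the argmax localizes the target expression in time."""
    scores = segment_scores(w, b, video_segments)
    k = int(np.argmax(scores))
    return scores[k], k  # (bag score, index of discriminative segment)

# Toy usage with random data (purely illustrative).
rng = np.random.default_rng(0)
w, b = rng.normal(size=64), 0.0
video = rng.normal(size=(12, 64))  # 12 temporal segments, 64-d features
score, where = mil_predict(w, b, video)
print(f"bag score {score:.2f}, discriminative segment {where}")
```

This max-pooling over instances is what lets a weak video-level label drive both classification and localization, and it is also why plain MIL commits to a single discriminative event.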
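Similarly, a minimal sketch of the latent ordinal inference in LOMo [1], under stated assumptions: K sub-event templates are assigned to temporally ordered frames, and inference maximizes the summed template responses over ordered assignments. The brute-force search is for clarity only and is not the paper's algorithm, which additionally learns a prior on the sub-event ordering.

```python
import numpy as np
from itertools import combinations

def lomo_inference(templates, frames):
    """Best temporally ordered assignment of K sub-event templates to
    frames. Brute force over ordered index tuples is exponential and is
    used here only to make the ordinal constraint explicit."""
    resp = frames @ templates.T                # (T, K) per-frame responses
    best_score, best_assign = -np.inf, None
    for idx in combinations(range(len(frames)), len(templates)):
        # combinations() yields strictly increasing indices, which
        # enforces the ordinal constraint t_1 < t_2 < ... < t_K.
        s = sum(resp[t, k] for k, t in enumerate(idx))
        if s > best_score:
            best_score, best_assign = s, idx
    return best_score, best_assign

# Toy usage: 3 sub-events over a 10-frame video (illustrative only).
rng = np.random.default_rng(1)
templates = rng.normal(size=(3, 32))
frames = rng.normal(size=(10, 32))
print(lomo_inference(templates, frames))
```

Training would then plug this max score into a hinge loss and update the templates by SGD, in the spirit of the latent structured SVM objective described in the abstract.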