SIGML - Special Interest Group, Machine Learning

Title : Learning with Complex Performance Metrics
Speaker : Nagarajan Natarajan (Microsoft Research, India)
Date : Jan 13, 2016 (Fri)
Time : 5pm
Venue : RM101

Abstract
Prediction tasks arising in modern day recommender systems often necessitate
complex performance metrics for evaluation. For instance, classification accuracy
(or the “0-1 loss”) metric is ill-suited for rare event classification problems
such as medical diagnosis, fraud detection, click rate prediction and text retrieval
applications. Practitioners instead employ alternative metrics better tuned to
imbalanced classification, such as the F-measure. An important theoretical question
concerning complex metrics is characterizing their optimal decision functions given
the inherent uncertainty in the data and the labeling process. The motivation is
that such a theoretical understanding would then lead to developing practical algorithms
for directly optimizing the desired metric. Conventional learning theoretic results
and a host of learning algorithms focus on “simple” metrics such as the squared loss
or the 0-1 loss. In this talk, I will discuss recent developments in trying to extend
traditional learning theoretic results to complex performance metrics used in modern
predictive tasks (multilabel learning, recommender systems with missing observations).
In particular, most of the complex performance metrics admit a simple optimal decision
function, that is characterized by thresholding the conditional likelihood of label
given instance at some threshold (much like the 0-1 loss, where the threshold is simply 0.5).

Much of the talk will draw upon our own work presented at NIPS '15 and '16. Finally, I will
highlight some open problems in this line of research. This is joint work with Oluwasanmi
Koyejo, Pradeep Ravikumar, Inderjit Dhillon and Prateek Jain.

Bio
Nagarajan Natarajan is currently a Post-doctoral Researcher at Microsoft Research in Bangalore.
He received his PhD in Computer Science in Fall 2009 from the University of Texas at Austin,
where he worked with Inderjit Dhillon and Pradeep Ravikumar on various theoretical and applied
machine learning problems. His research is primarily on statistical learning theory, with emphasis
on applications to problems in computational biology and modern recommender systems. At Microsoft
Research, he works on interesting open problems in learning theoretic guarantees for complex
performance metrics, resource-constrained machine learning, and robust learning (recommender systems,
multi-label learning). Refer to his google scholar profile https://scholar.google.co.in/citations… for details.