Title: Deep Learning for Text Generation
Speaker: Arvind Agarwal, IBM Research
Date: Thursday, 02 November 2017
Time: 1530 hrs
Venue: KD101, CSE Department

Abstract: The success of deep learning methods in the last decade has made them the methods of choice for computer vision applications. These methods have now started to make their way into natural language processing as well. In this presentation, I will talk about some of our work on deep learning for the problems of paraphrase generation and document summarization. In particular, I shall discuss generative methods for both problems, where the goal is to generate or synthesize text.

For paraphrase generation, we have proposed a deep generative framework, specifically a Variational Autoencoder (VAE)-based architecture augmented with sequence-to-sequence models. Unlike a traditional VAE, our model conditions both the encoder and the decoder on the input sentence, and can therefore generate multiple paraphrases for a given sentence in a principled way. We evaluate our method on benchmark datasets and show that it outperforms the state of the art by a significant margin. The generated paraphrases are not just semantically similar to the original input sentence; they are also able to capture new concepts.

For summarization, we will focus on both extractive and abstractive approaches. In extractive summarization, I will talk about a hierarchical deep neural network in which all sentences of a document are encoded into distributional representations, which are then passed to a classifier that makes a binary decision on whether to keep each sentence in the summary. For abstractive summarization, the sentence representations obtained from the hierarchical network are instead passed to a decoder, which uses text generation methods to construct a new summary. While deep learning methods have reported better accuracy than traditional methods on both problems, they have been particularly useful for abstractive summarization, where they make it possible to generate text learned directly from data, without any human-encoded knowledge.

Bio: Arvind Agarwal is currently a research scientist at IBM Research - India (New Delhi). Prior to joining IBM, he was a research scientist at the Palo Alto Research Center (PARC), Webster, NY. His research interests lie in machine learning, natural language processing, deep learning, and text analytics. He is especially interested in machine learning sub-areas that deal with the problem of no (or limited) supervised data, such as self-learning, semi-supervised and unsupervised learning, zero-shot learning, domain adaptation, and multitask learning. His recent interests also include the application of deep learning methods to problems related to text generation. Arvind completed his PhD in Computer Science at the University of Maryland, College Park, USA. He received his M.S. in Computer Science from the School of Computing, University of Utah, USA. His bachelor's degree is from the Birla Institute of Technology & Science (BITS), Pilani, India. He has published several papers in machine learning and data mining conferences such as KDD, NIPS, IJCAI, AISTATS, and others. He is also a recipient of the ECML 2010 best student paper award.
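
To make the conditional-VAE idea from the abstract concrete, the following is a minimal toy sketch in PyTorch of a VAE whose encoder and decoder are both conditioned on the input sentence, so that sampling different latent codes yields different paraphrases. It is only an illustration of the general technique under assumed choices (GRU encoders/decoders, toy vocabulary and dimensions); it is not the speaker's actual model, and every name and size in it is an assumption.

# Hypothetical minimal sketch of a conditional VAE for paraphrase
# generation. NOT the speaker's implementation; all names, sizes,
# and design choices are illustrative assumptions.
import torch
import torch.nn as nn

class ConditionalVAEParaphraser(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128, z_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Encoder for the original (conditioning) sentence.
        self.cond_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Recognition network: encodes the paraphrase, conditioned on
        # the original sentence's representation.
        self.recog_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(2 * hid_dim, z_dim)
        self.to_logvar = nn.Linear(2 * hid_dim, z_dim)
        # Decoder: generates the paraphrase from [z; condition].
        self.z_to_h = nn.Linear(z_dim + hid_dim, hid_dim)
        self.dec = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def encode_condition(self, src):
        _, h = self.cond_enc(self.emb(src))   # h: (1, B, hid)
        return h.squeeze(0)                   # (B, hid)

    def forward(self, src, tgt):
        cond = self.encode_condition(src)
        _, h = self.recog_enc(self.emb(tgt))
        h = torch.cat([h.squeeze(0), cond], dim=-1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        h0 = torch.tanh(self.z_to_h(torch.cat([z, cond], dim=-1)))
        # Teacher forcing on target tokens (shifting omitted for brevity).
        dec_out, _ = self.dec(self.emb(tgt), h0.unsqueeze(0))
        logits = self.out(dec_out)            # (B, T, vocab)
        # KL term of the VAE objective (vs. a standard normal prior).
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return logits, kl

# Toy usage: a batch of 2 sentences, 5 token ids each.
model = ConditionalVAEParaphraser()
src = torch.randint(0, 1000, (2, 5))
tgt = torch.randint(0, 1000, (2, 5))
logits, kl = model(src, tgt)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), tgt.reshape(-1)) + kl / src.size(0)

Because the decoder receives both the latent code z and the encoding of the source sentence, sampling several z values at test time produces several distinct paraphrases of the same input, which is the property the abstract highlights.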
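
Similarly, here is a hedged toy sketch of the hierarchical extractive setup the abstract describes: a word-level encoder builds one vector per sentence, a sentence-level encoder contextualizes those vectors across the document, and a classifier makes a binary keep/drop decision per sentence. Again, this is a generic illustration under assumed choices, not the speaker's architecture.

# Hypothetical sketch of a hierarchical extractive summarizer.
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class HierarchicalExtractor(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.word_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.sent_enc = nn.GRU(hid_dim, hid_dim, batch_first=True)
        self.classifier = nn.Linear(hid_dim, 1)  # keep-in-summary logit

    def forward(self, doc):
        # doc: (num_sents, max_words) token ids for one document.
        _, h = self.word_enc(self.emb(doc))      # h: (1, S, hid)
        sent_vecs = h.squeeze(0).unsqueeze(0)    # (1, S, hid)
        ctx, _ = self.sent_enc(sent_vecs)        # (1, S, hid)
        return self.classifier(ctx).squeeze(-1)  # (1, S) logits

# Toy usage: a 4-sentence document, 6 tokens per sentence.
model = HierarchicalExtractor()
doc = torch.randint(0, 1000, (4, 6))
logits = model(doc)
keep = torch.sigmoid(logits) > 0.5  # binary in/out decision per sentence

For the abstractive variant the abstract mentions, the contextual sentence vectors (ctx above) would feed a text-generating decoder instead of a binary classifier.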