Articles, Code, Slides

Please read before using Research Code

  • Code used to produce results for research papers can be found linked alongside the papers themselves.

  • The code given or linked on this website is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

  • If you use this code in your research, please cite the relevant publications given along with the links.

  • I would be happy to receive comments and feedback on the code as well as the underlying methods.

Books/Monographs

  1. Non-convex Optimization for Machine Learning,
    Prateek Jain and P.K.,
    Foundations and Trends in Machine Learning [official copy],
    arXiv : 1712.07897 [stat.ML], 2017.
    [pdf]

Journal Publications

  1. Optimizing non-decomposable measures with deep networks,
    Amartya Sanyal, Pawan Kumar, P.K., Sanjay Chawla, and Fabrizio Sebastiani,
    Machine Learning 107(8-10): 1597-1620 (2018).
    Also at the 29th European Conference on Machine Learning (ECML), 2018.
    [open access link]

  2. Corruption-tolerant bandit learning,
    Sayash Kapoor, Kumar Kshitij Patel, and P.K.,
    Machine Learning 108(4): 687–715 (2019).
    [open access link]

  3. Epidemiologically and Socio-economically Optimal Policies via Bayesian Optimization,
    Amit Chandak, Debojyoti Dey, Bhaskar Mukhoty, and P. K.,
    Transactions of the Indian National Academy of Engineering (Special issue on Technologies for Fighting COVID-19) 5(2): 117–127 (2020).
    [arXiv], [code]

  4. Robust statistical calibration and characterization of portable low-cost air quality monitoring sensors to quantify real-time O3 and NO2 concentrations in diverse environments,
    Ravi Sahu, Ayush Nagal, Kuldeep Kumar Dixit, Harshavardhan Unnibhavi, Srikanth Mantravadi, Srijith Nair, Yogesh Simmhan, Brijesh Mishra, Rajesh Zele, Ronak Sutaria, Vidyanand Motiram Motghare, P. K., and Sachchida Nand Tripathi,
    Atmospheric Measurement Techniques 14(1): 37–52 (2021).
    [open access link], [code]

  5. Robust non-Parametric Regression via Incoherent Subspace Projections,
    Bhaskar Mukhoty, Subhajit Dutta, and P.K.,
    Machine Learning 110: 2941–2989 (2021).
    Also at the 32nd European Conference on Machine Learning (ECML), 2021.
    [open access link], [full pdf], [compressed pdf], [code]
    Please note that the full PDF is approximately 38 MB because of its high-resolution images.
    If you have low bandwidth, you may download the compressed PDF instead, but note that its images are low resolution.

  6. Prutor: an intelligent learning and management system for programming courses,
    Amey Karkare and P.K.,
    Communications of the ACM 65(11): 62–64 (2022).
    [link]

Conference Publications

  1. INGIT: Limited Domain Formulaic Translation from Hindi to Indian Sign Language,
    P. K., Madhusudan Reddy, Amitabha Mukerjee, and Achla M Raina,
    5th International Conference on Natural Language Processing (ICON), pages 69-78, 2007.
    [pdf], [pdf-slides 1], [pdf-slides 2], [pdf-slides 3]

  2. On Low Distortion Embeddings of Statistical Distance Measures into Low Dimensional Spaces,
    Arnab Bhattacharya, P. K., and Manjish Pal,
    20th International Conference on Database and Expert Systems Applications (DEXA), 2009,
    Springer Lecture Notes in Computer Science (LNCS), 5690:164-172, 2009,
    arXiv: 0909.3169 [cs.CG], 2009.
    [pdf], [pdf-slides]

  3. Estimating the First Frequency Moment of Data Streams in Nearly Optimal Space and Time,
    Sumit Ganguly and P. K.,
    12th Italian Conference on Theoretical Computer Science (ICTCS), 2010,
    arXiv: 1005.0809v1 [cs.DS], 2010.
    [pdf]

  4. Random Projection Trees Revisited,
    Aman Dhesi and P. K.,
    24th Annual Conference on Neural Information Processing Systems (NIPS), 2010,
    arXiv: 1010.3812 [cs.DS], 2010.
    [pdf], [poster]

  5. Similarity-based Learning via Data driven Embeddings,
    P. K. and Prateek Jain,
    25th Annual Conference on Neural Information Processing Systems (NIPS), 2011,
    arXiv: 1112.5404 [cs.LG], 2011.
    [pdf], [poster]

  6. Random Feature Maps for Dot Product Kernels,
    P. K. and Harish Karnick,
    15th International Conference on Artificial Intelligence and Statistics (AISTATS), 2012,
    Journal of Machine Learning Research (JMLR) : W&CP, 22:583-591, 2012,
    arXiv: 1201.6530 [cs.LG], 2012.
    [pdf], [code], [poster]

  7. Supervised Learning with Similarity Functions,
    P. K. and Prateek Jain,
    26th Annual Conference on Neural Information Processing Systems (NIPS), 2012,
    arXiv: 1210.5840 [cs.LG], 2012.
    [pdf], [poster]

  8. On the Generalization Ability of Online Learning Algorithms for Pairwise Loss Functions,
    P. K., Bharath Sriperumbudur, Prateek Jain, and Harish Karnick,
    30th International Conference on Machine Learning (ICML), 2013 (Oral Presentation),
    Journal of Machine Learning Research (JMLR), W&CP, 28(3):441-449, 2013,
    arXiv: 1305.2505 [cs.LG], 2013.
    [pdf], [code], [poster], [pdf-slides]

  9. Large-scale Multi-label Learning with Missing Labels,
    Hsiang-Fu Yu, Prateek Jain, P. K., and Inderjit S. Dhillon,
    31st International Conference on Machine Learning (ICML), 2014,
    Journal of Machine Learning Research (JMLR), W&CP, 32(1):593-601, 2014,
    arXiv: 1307.5101 [cs.LG], 2013.
    [pdf], [code], [poster]

  10. On Iterative Hard Thresholding Methods for High-dimensional M-Estimation,
    Prateek Jain, Ambuj Tewari, and P. K.,
    28th Annual Conference on Neural Information Processing Systems (NIPS), 2014,
    Also at the 22nd International Symposium on Mathematical Programming (ISMP), 2015,
    Also at the 7th NIPS Workshop on Optimization for Machine Learning (OPT), December 12-13, 2014,
    arXiv: 1410.5137 [cs.LG], 2014.
    [pdf], [poster]

  11. Online and Stochastic Gradient Methods for Non-decomposable Loss Functions,
    P. K., Harikrishna Narasimhan, and Prateek Jain,
    28th Annual Conference on Neural Information Processing Systems (NIPS), 2014,
    Also at the Symposium on Learning, Algorithms and Complexity (LAC), IISc, January 5-9, 2015,
    arXiv: 1410.6776 [cs.LG], 2014.
    [pdf], [poster]

  12. Optimizing Non-decomposable Performance Measures: A Tale of Two Classes,
    Harikrishna Narasimhan, P. K., and Prateek Jain,
    32nd International Conference on Machine Learning (ICML), 2015.
    Journal of Machine Learning Research (JMLR), W&CP, 37, 2015,
    arXiv: 1505.06812 [stat.ML], 2015.
    [pdf], [poster]

  13. Surrogate Functions for Maximizing Precision at the Top,
    P. K., Harikrishna Narasimhan, and Prateek Jain,
    32nd International Conference on Machine Learning (ICML), 2015.
    Journal of Machine Learning Research (JMLR), W&CP, 37, 2015,
    arXiv: 1505.06813 [stat.ML], 2015.
    [pdf], [poster], [pptx-slides]

  14. Sparse Local Embeddings for Extreme Multi-label Classification,
    Kush Bhatia, Himanshu Jain, P.K., Manik Varma, and Prateek Jain,
    29th Annual Conference on Neural Information Processing Systems (NIPS), 2015,
    Also at the ICML Workshop on Extreme Classification, July 10, 2015.
    arXiv : 1507.02743 [cs.LG], 2015.
    [pdf], [poster]

  15. Robust Regression via Hard Thresholding,
    Kush Bhatia, Prateek Jain, and P.K.,
    29th Annual Conference on Neural Information Processing Systems (NIPS), 2015,
    Also at the IEEE Information Theory Workshop (ITW), October 11-15, 2015 (Invited Paper).
    Also at the NIPS Workshop on Non-convex Optimization for Machine Learning, December 11-12, 2015.
    arXiv : 1506.02428 [cs.LG], 2015.
    [pdf], [poster 1], [poster 2]

  16. Stochastic Optimization Techniques for Quantification Performance Measures,
    P.K., Shuai Li, Harikrishna Narasimhan, Sanjay Chawla, and Fabrizio Sebastiani,
    22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2016.
    arXiv : 1605.04135 [stat.ML], 2016.
    [pdf]

  17. Optimizing the Multiclass F-measure via Biconcave Programming,
    Weiwei Pan, Harikrishna Narasimhan, P.K., Pavlos Protopapas, and Harish G. Ramaswamy,
    16th IEEE International Conference on Data Mining (ICDM), 2016.
    [pdf]

  18. Scalable Optimization of Multivariate Performance Measures in Multi-instance Multi-label Learning,
    Apoorv Aggarwal, Sandip Ghoshal, Ankith M S, Suhit Sinha, Ganesh Ramakrishnan, P. K., and Prateek Jain,
    31st AAAI Conference on Artificial Intelligence (AAAI), 2017.
    [pdf]

  19. On Context-Dependent Clustering of Bandits,
    Claudio Gentile, Shuai Li, P. K., Alexandros Karatzoglou, Giovanni Zappella, and Evans Etrue Howard,
    34th International Conference on Machine Learning (ICML), 2017.
    [pdf]

  20. Consistent Robust Regression,
    Kush Bhatia, Prateek Jain, Parameswaran Kamalaruban, and P.K.,
    31st Annual Conference on Neural Information Processing Systems (NeurIPS), 2017.
    Also at the 10th NIPS Workshop on Optimization for Machine Learning (OPT), December 8, 2017,
    [pdf]

  21. Compilation error repair: for the student programs, from the student programs,
    Umair Z. Ahmed, Pawan Kumar, Amey Karkare, P.K., and Sumit Gulwani,
    40th International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET), 2018,
    [pdf]

  22. Globally-convergent Iteratively Reweighted Least Squares for Robust Regression Problems,
    Bhaskar Mukhoty, Govind Gopakumar, Prateek Jain, and P.K.,
    22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019,
    [pdf]

  23. Accelerating Extreme Classification via Adaptive Feature Agglomeration,
    Ankit Jalan and P.K.,
    28th International Joint Conference on Artificial Intelligence (IJCAI), 2019.
    arXiv : 1905.11769 [cs.LG], 2019.
    [pdf], [code]

  24. MACER: A Modular Framework for Accelerated Compilation Error Repair,
    Darshak Chhatbar, Umair Z Ahmed, and P.K.,
    21st International Conference on Artificial Intelligence in Education (AIED), 2020.
    [arXiv], [code]

  25. DECAF : Deep Extreme Classification with Label Features,
    Anshul Mittal, Kunal Dahiya, Sheshansh Agrawal, Deepak Saini, Sumeet Agarwal, P.K., and Manik Varma,
    14th ACM International Conference on Web Search and Data Mining (WSDM), 2021.
    [pdf], [code]

  26. ECLARE: Extreme Classification with Label Graph Correlations,
    Anshul Mittal, Noveen Sachdeva, Sheshansh Agrawal, Sumeet Agarwal, P.K., and Manik Varma,
    30th The Web Conference (TheWebConf – formerly WWW), 2021.
    [pdf], [code]

  27. SiameseXML: Siamese Networks meet Extreme Classifiers with 100M Labels,
    Kunal Dahiya, Ananye Agarwal, Deepak Saini, Gururaj K, Jian Jiao, Amit Singh, Sumeet Agarwal, P.K., and Manik Varma,
    38th International Conference on Machine Learning (ICML), 2021.
    [pdf], [supplementary], [code]

  28. AGGLIO: Global Optimization for Locally Convex Functions,
    Debojyoti Dey, Bhaskar Mukhoty, and P.K.,
    9th ACM IKDD Conference on Data Science (CODS), 2022.
    [arXiv], [code]

  29. IGLU: Efficient GCN Training via Lazy Updates,
    S Deepak Narayanan, Aditya Sinha, Prateek Jain, P.K., and Sundararajan Sellamanickam,
    10th International Conference on Learning Representations (ICLR), 2022.
    [paper]

  30. Multi-modal Extreme Classification,
    Anshul Mittal, Shreya Malani, Kunal Dahiya, Janani Ramaswamy, Seba Kuruvilla, Jitendra Ajmera, Keng-Hao Chang, Sumeet Agarwal, P.K., and Manik Varma,
    35th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

  31. NGAME: Negative Mining-aware Mini-batching for Extreme Classification,
    Kunal Dahiya, Nilesh Gupta, Deepak Saini, Akshay Soni, Yajun Wang, Kushal Dave, Jian Jiao, Gururaj K, Prasenjit Dey, Amit Singh, Deepesh Hada, Vidit Jain, Bhawna Paliwal, Anshul Mittal, Sonu Mehta, Ramachandran Ramjee, Sumeet Agarwal, P.K., and Manik Varma,
    16th ACM International Conference on Web Search and Data Mining (WSDM), 2023.

  32. Gradient Perturbation-based Efficient Deep Ensembles,
    Amit Chandak, P.K., and Piyush Rai,
    10th ACM IKDD Conference on Data Science (CODS), 2023.

  33. Advances in Automated Pedagogical Error Repair,
    Sharath H. Padmanabha, Fahad Shaikh, Mayank Bansal, Debanjan Chatterjee, Preeti Singh, Amey Karkare, and P.K.,
    16th Innovations in Software Engineering Conference (ISEC), 2023.

  34. PRIORITY: An Intelligent Problem Indicator Repository,
    Sharath H. Padmanabha, Fahad Shaikh, Mayank Bansal, Debanjan Chatterjee, Preeti Singh, Amey Karkare, and P.K.,
    16th Innovations in Software Engineering Conference (ISEC), 2023.

  35. Corruption-tolerant Algorithms for Generalized Linear Models,
    Bhaskar Mukhoty, Debojyoti Dey, and P.K.,
    37th AAAI Conference on Artificial Intelligence (AAAI), 2023.

Miscellaneous and (permanently) arXiv-ed articles

  1. The Ultra Experience of a non-Athlete,
    An article on how I came to participate in an ultra-marathon.
    P. K.,
    2010.
    [pdf]

    I once ran in an ultra-marathon. It was such an experience that someone asked me to write about it.

  2. Why we respect our Teachers : A Note on Language Learnability and Active Learning,
    P. K.,
    Notes on Engineering Research and Development, 3(3):31-36, 2011.
    [pdf]

  3. Learning with Supportive Vectors : An Introduction to Support Vector Machines and their Applications,
    P. K.,
    Notes on Engineering Research and Development, 3(1):2-6, 2010.
    [pdf]

    A while ago I contributed a couple of introductory articles to a student-led publication called NERD (Notes on Engineering Research and Development) [url].

  4. On Translation Invariant Kernels and Screw Functions,
    P. K. and Harish Karnick,
    arXiv : 1302.4343 [math.FA], 2013.
    [pdf]

  5. Generalization Guarantees for a Binary Classification Framework for Two-Stage Multiple Kernel Learning,
    P. K.,
    arXiv : 1302.0406 [cs.LG], 2013.
    [pdf]