ARNAB BHATTACHARYA


Associate Professor, Dept. of Computer Science and Engineering, Indian Institute of Technology, Kanpur.


Email: arnabb@iitk.ac.in, arnabb@cse.iitk.ac.in, arnabbhattacharya@gmail.com

Web Page: http://www.cse.iitk.ac.in/users/arnabb/

Address: Dept. of Computer Science and Engineering, Indian Institute of Technology, Kanpur, UP - 208016, India.

Phone: +91-512-679-7650, +91-512-392-7650, +91-512-259-7650.


Area of Research: Databases, Data Mining, Bioinformatics.


Experience:

  • Associate Professor, Dept. of Computer Science and Engineering, Indian Institute of Technology (IIT), Kanpur, India. June 2014 - present.
  • Assistant Professor, Dept. of Computer Science and Engineering, Indian Institute of Technology (IIT), Kanpur, India. December 2007 - June 2014.
  • Project Scientist, Dept. of Computer Science, University of California, Santa Barbara, CA, USA. September 2007 - November 2007.
  • Graduate Student Research Assistant, Dept. of Computer Science, University of California, Santa Barbara, CA, USA. July 2003 - August 2007.
  • Teaching Assistant, Dept. of Computer Science, University of California, Santa Barbara, CA, USA. September 2002 - June 2003.
  • Software Design Engineer, Texas Instruments (India) Ltd., Bangalore, India. July 2001 - July 2002.


Books:

  1. Fundamentals of Database Indexing and Searching. Arnab Bhattacharya. CRC Press, 2014.


Selected Publications:

  1. GARUDA: A System for Large-Scale Mining of Statistically Significant Connected Subgraphs. Satyajit Bhadange, Akhil Arora, Arnab Bhattacharya. Demo at International Conference on Very Large Data Bases (VLDB), 2016, to appear, New Delhi, India.

  2. SMS: Stable Matching Algorithm using Skylines. Rohit Anurag, Arnab Bhattacharya. International Conference on Scientific and Statistical Database Management (SSDBM), 2016, pages 24:1-24:4, Budapest, Hungary.

  3. SkyCover: Finding Range-Constrained Approximate Skylines with Bounded Quality Guarantees. Shubhendu Aggarwal, Shubhadip Mitra, Arnab Bhattacharya. International Conference on Management of Data (COMAD), 2016, pages 1-12, Pune, India.

  4. Probabilistic Aggregate Skyline Join Queries: Skylines with Aggregate Operations over Existential Uncertain Relations. Arnab Bhattacharya, Shrikant Awate. International Conference on Scientific and Statistical Database Management (SSDBM), 2015, pages 5:1-5:12, San Diego, USA.

  5. Trajectory Aware Macro-cell Planning for Mobile Users. Shubhadip Mitra, Sayan Ranu, Vinay Kolar, Arnab Bhattacharya, Ravi Kokku, Aditya Telang, Sriram Raghavan. IEEE International Conference on Computer Communications (INFOCOM), 2015, pages 792-800, Hong Kong, China.

  6. Generation of Random Triangular Digital Curves using Combinatorial Techniques. Apurba Sarkar, Arindam Biswas, Mousumi Dutt, Arnab Bhattacharya. International Conference on Pattern Recognition and Machine Intelligence (PReMI), 2015, pages 136-145, Warsaw, Poland.

  7. Using Social Connections to Improve Collaborative Filtering. Kanish Manuja, Arnab Bhattacharya. IKDD Conference on Data Sciences (CoDS), 2015, pages 140-141, Bengaluru, India.

  8. Generation of Random Digital Curves using Combinatorial Techniques. Apurba Sarkar, Arindam Biswas, Mousumi Dutt, Arnab Bhattacharya. Conference on Algorithms and Discrete Applied Mathematics (CALDAM), 2015, pages 286-297, Kanpur, India.

  9. Mining Statistically Significant Connected Subgraphs in Vertex Labeled Graphs. Akhil Arora, Mayank Sachan, Arnab Bhattacharya. SIGMOD International Conference on Management of Data (SIGMOD), 2014, pages 1003-1014, Snowbird, USA.

  10. Efficient and Effective Route Planning in Road Networks with Probabilistic Data using Skyline Paths. Arzoo Katiyar, Arnab Bhattacharya, Shubhadip Mitra. IKDD Conference on Data Sciences (CoDS), 2014, New Delhi, India.

  11. Emotion Recognition from Audio and Visual Data using F-score based Fusion. Abhishek Gera, Arnab Bhattacharya. IKDD Conference on Data Sciences (CoDS), 2014, New Delhi, India.

  12. RCached-tree: An Index Structure for Efficiently Answering Popular Queries. Manash Pal, Arnab Bhattacharya, Debjyoti Paul. International Conference on Information and Knowledge Management (CIKM), 2013, pages 1173-1176, San Francisco, USA.

  13. Efficient Edit Distance based String Similarity Search using Deletion Neighborhoods. Shashwat Mishra, Tejas Gandhi, Akhil Arora, Arnab Bhattacharya. EDBT/ICDT Workshops, 2013, pages 375-383, Genoa, Italy.

  14. Hybrid HBase: Leveraging Flash SSDs to Improve Cost per Throughput of HBase. Anurag Awasthi, Avani Nandini, Arnab Bhattacharya, Priya Sehgal. International Conference on Management of Data (COMAD), 2012, pages 68-79, Pune, India.

  15. A Plant Identification System using Shape and Morphological Features on Segmented Leaflets: Team IITK, CLEF 2012. Akhil Arora, Ankit Gupta, Nitesh Bagmar, Shashwat Mishra, Arnab Bhattacharya. CLEF (Online Notes/Labs/Workshop), 2012, Rome, Italy.

  16. Mining Statistically Significant Substrings using the Chi-Square Statistic. Mayank Sachan, Arnab Bhattacharya. International Conference on Very Large Data Bases (VLDB), 2012, pages 1052-1063, Istanbul, Turkey.

  17. Mining Statistically Significant Substrings using the Chi-Square Statistic. Mayank Sachan, Arnab Bhattacharya. Proceedings of the VLDB Endowment (PVLDB), 2012, 5(10), pages 1052-1063.

  18. Mining Statistically Significant Substrings Based on the Chi-Square Measure. Sourav Dutta, Arnab Bhattacharya. Book chapter in Pattern Discovery Using Sequence Data Mining: Applications and Studies edited by P. Kumar, P. R. Krishna and S. B. Raju. IGI Global, 2012.

  19. Minimally Infrequent Itemset Mining using Pattern-Growth Paradigm and Residual Trees. Ashish Gupta, Akshay Mittal, Arnab Bhattacharya. International Conference on Management of Data (COMAD), 2011, pages 57-68, Bengaluru, India. (Best paper)

  20. Caching Stars in the Sky: A Semantic Caching Approach to Accelerate Skyline Queries. Arnab Bhattacharya, B. Palvali Teja, Sourav Dutta. International Conference on Database and Expert Systems Applications (DEXA), 2011, pages 493-501, Toulouse, France.

  21. A Continuous Query System for Dynamic Route Planning. Nirmesh Malviya, Samuel Madden, Arnab Bhattacharya. International Conference on Data Engineering (ICDE), 2011, pages 792-803, Hannover, Germany.

  22. Finding the Bias and Prestige of Nodes in Networks based on Trust Scores. Abhinav Mishra, Arnab Bhattacharya. International World Wide Web Conference (WWW), 2011, pages 567-576, Hyderabad, India.

  23. Aggregate Skyline Join Queries: Skylines with Aggregate Operations over Multiple Relations. Arnab Bhattacharya, B. Palvali Teja. International Conference on Management of Data (COMAD), 2010, pages 15-26, Nagpur, India. (Best student paper)

  24. INSTRUCT: Space-Efficient Structure for Indexing and Complete Query Management of String Databases. Sourav Dutta, Arnab Bhattacharya. International Conference on Management of Data (COMAD), 2010, pages 27-38, Nagpur, India.

  25. Simulated Evolution and Learning, Proceedings of the 8th International Conference on Simulated Evolution and Learning (SEAL). Co-edited by K. Deb, A. Bhattacharya, N. Chakraborti, P. Chakroborty, S. Das, J. Dutta, S. K. Gupta, A. Jain, V. Aggarwal, J. Branke, S. J. Louis, K. C. Tan, Springer, 2010.

  26. Minimum Spanning Tree on Spatio-Temporal Networks. Viswanath Gunturi, Shashi Shekhar, Arnab Bhattacharya. International Conference on Database and Expert Systems Applications (DEXA), 2010, pages 149-158, Bilbao, Spain.

  27. Finding Top-k Similar Pairs of Objects Annotated with Terms from an Ontology. Arnab Bhattacharya, Abhishek Bhowmick, Ambuj K. Singh. International Conference on Scientific and Statistical Database Management (SSDBM), 2010, pages 214-232, Heidelberg, Germany.

  28. Most Significant Substring Mining based on Chi-square Measure. Sourav Dutta, Arnab Bhattacharya. Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2010, pages 319-327, Hyderabad, India.

  29. Querying Spatial Patterns. Vishwakarma Singh, Arnab Bhattacharya, Ambuj K. Singh. International Conference on Extending Database Technology (EDBT), 2010, pages 418-429, Lausanne, Switzerland.

  30. Image Management for Biological Data. Arnab Bhattacharya, Vebjorn Ljosa. Book chapter in Encyclopedia of Database Systems edited by M. T. Ozsu and L. Liu. Springer, 2009.

  31. On Low Distortion Embeddings of Statistical Distance Measures into Low Dimensional Spaces. Arnab Bhattacharya, Purushottam Kar, Manjish Pal. International Conference on Database and Expert Systems Applications (DEXA), 2009, pages 164-172, Linz, Austria.

  32. FTDP-17 Mutations in Tau Alter the Regulation of Microtubule Dynamics: An "Alternative Core" Model for Normal and Pathological Tau Action. Adria LeBoeuf, Sasha F. Levy, Michelle Gaylord, Arnab Bhattacharya, Ambuj K. Singh, Mary Ann Jordan, Leslie Wilson, Stuart C. Feinstein. Journal of Biological Chemistry, 2008, 283(52), pages 36406-36415.

  33. A General Modeling and Visualization Tool for Comparing Different Members of a Group: Application to Studying Tau-Mediated Regulation of Microtubule Dynamics. Arnab Bhattacharya, Sasha Levy, Adria LeBoeuf, Michelle Gaylord, Leslie Wilson, Ambuj K. Singh, Stuart C. Feinstein. BMC Bioinformatics, 2008, 9, page 339.

  34. Efficient Computation of Statistical Significance of Query Results in Databases. Vishwakarma Singh, Arnab Bhattacharya, Ambuj K. Singh. International Conference on Scientific and Statistical Database Management (SSDBM), 2008, pages 509-516, Hong Kong, China.

  35. MIST: Distributed Indexing and Querying in Sensor Networks using Statistical Models. Arnab Bhattacharya, Anand Meka, Ambuj K. Singh. International Conference on Very Large Data Bases (VLDB), 2007, pages 854-865, Vienna, Austria.

  36. Indexing Spatially Sensitive Distance Measures Using Multi-Resolution Lower Bounds. Vebjorn Ljosa, Arnab Bhattacharya, Ambuj K. Singh. International Conference on Extending Database Technology (EDBT), 2006, pages 865-883, Munich, Germany.

  37. LB-Index: A Multi-Resolution Index Structure for Images. Vebjorn Ljosa, Arnab Bhattacharya, Ambuj K. Singh. International Conference on Data Engineering (ICDE), 2006, pages 144-145, Atlanta, USA.

  38. ViVo: Visual Vocabulary Construction for Mining Biomedical Images. Arnab Bhattacharya, Vebjorn Ljosa, Jia-Yu Pan, Mark R. Verardo, Hyung-Jeong Yang, Christos Faloutsos, Ambuj K. Singh. International Conference on Data Mining (ICDM), 2005, pages 50-57, Houston, USA. (One of the top five student papers)

  39. ProGreSS: Simultaneous Searching of Protein Databases by Sequence and Structure. Arnab Bhattacharya, Tolga Can, Tamer Kahveci, Ambuj K. Singh, Yuan-Fang Wang. Pacific Symposium on Biocomputing (PSB), 2004, pages 264-275, Hawaii, USA.


Education:

  • Ph.D. in Computer Science, Dept. of Computer Science, University of California, Santa Barbara, CA 93106, USA. 2007.
  • M.S. in Computer Science, Dept. of Computer Science, University of California, Santa Barbara, CA 93106, USA. 2007.
  • Bachelor of Computer Science and Engineering (B.C.S.E.), Jadavpur University, Kolkata - 700032, India. 2001.
  • Higher Secondary Examination, West Bengal Council of Higher Secondary Examination, India. 1997.
  • Secondary (Madhyamik) Examination, West Bengal Board of Secondary Education, India. 1995.


Invited Talks:

  1. "Trajectory Aware Service Location Problems" at the NetApp Corporation, Bengaluru, India, 2016.
  2. "Mining Statistically Significant Connected Subgraphs" at the NetApp Corporation, Bengaluru, India, 2015.
  3. "Mining Statistically Significant Substructures based on the Chi-square Statistic" at IBM, New Delhi, India, 2015.
  4. "Mining Statistically Significant Substrings based on the Chi-square Measure" at the NetApp Corporation, Bengaluru, India, 2014.
  5. "Mining Statistically Significant Substructures based on the Chi-square Statistic" at the Indian Statistical Institute, Kolkata, India, 2014.
  6. "Skylines: Databases' Answer to Multiple Preferences" at the NetApp Corporation, Bengaluru, India, 2013.
  7. "Skylines: Databases' Answer to Multiple Preferences" at the Dept. of Computer Science and Engineering, Indian Institute of Technology, Kanpur, India, 2012.
  8. "Finding the Bias and Prestige of Nodes in Networks based on Trust Scores" at Yahoo! Labs, Bengaluru, India, 2011.
  9. "Earth Mover's Distance: An Adaptable and Universally Applicable Distance Measure" at the Dept. of Computer Science, Andhra University, Vishakhapatnam, India, 2010.
  10. "Earth Mover's Distance: An Adaptable and Universally Applicable Distance Measure" at Tata Consultancy Services (TCS), Gurgaon, India, 2010.
  11. "On Earth Mover's Distance: A Spatially Sensitive Distance Measure" at the Dept. of Computer Science, Free University of Bozen-Bolzano, Italy, 2009.
  12. Popular lecture on "Game Theory" at the Business Club meeting of the Indian Institute of Technology, Kanpur, India, 2009.
  13. "Distributed Indexing and Querying in Sensor Networks using Statistical Models" at the Dept. of Computer Science, Université Libre de Bruxelles, Belgium, 2008.


Patents:

  1. Multiple Criteria Decision Analysis
    • US patent number US8504581B2: 6th Aug, 2013
    • India patent number INDEL20123027A: 25th Apr, 2014
  2. Multiple Criteria Decision Analysis in Distributed Databases
    • Global patent number WO2015104591A1: 16th Jul, 2015


Important Courses Taught:

  1. Skyline Queries in Databases

  2. Indexing and Searching Techniques in Databases

  3. Data Mining

  4. Topics in Biocomputing

  5. Principles of Database Systems

  6. Fundamentals of Computing

  7. Computing Laboratory


Awards, Scholarships and Certificates:

  1. Best runner-up at XRCI Open 2015 for the poster "Trajectory Aware Macro Cell Planning for Mobile Users".
  2. IBM Faculty Research Award, 2014.
  3. Recipient of award from Yahoo! Faculty Research and Engagement Program, 2011.
  4. Best paper award at the International Conference on Management of Data (COMAD), 2011 for the paper "Minimally Infrequent Itemset Mining using Pattern-Growth Paradigm and Residual Trees".
  5. Best student paper award at the International Conference on Management of Data (COMAD), 2010 for the paper "Aggregate Skyline Join Queries: Skylines with Aggregate Operations over Multiple Relations".
  6. One of the top-five student paper awards at the International Conference on Data Mining (ICDM), 2005 for the paper "ViVo: Visual Vocabulary Construction for Mining Biomedical Images".
  7. ICDM Student Travel Award sponsored by IBM at the International Conference on Data Mining (ICDM), 2005 awarded to the top five student papers.


Major Sponsored Projects:

  1. Big Data from IBM, India, 2014-present.
  2. Development of Air Quality Index (AQI) for Indian Cities from Central Pollution Control Board (CPCB), India, 2014-2015.
  3. Extending Skyline Queries to Distributed and Uncertain Databases from DST, Govt. of India, 2013-present.
  4. Deciphering the BMP Signaling Network in Developing Bone from DBT, Govt. of India, 2013-present.
  5. Flash-Aware Optimizations for Columnar Databases from NetApp Corporation, 2011-present.
  6. Reputation Framework for Ad Networks from Yahoo! Research, 2011.


Professional Activities:

  • Program Chair for the 19th International Conference on Management of Data (COMAD), 2013.
  • Executive Member of the Computer Society of India's (CSI) Special Interest Group in Data (SIGDATA) since 2012.
  • Publication Chair for the 18th International Conference on Management of Data (COMAD), 2012.
  • Program Chair for the 8th International Conference on Simulated Evolution and Learning (SEAL), 2010.
  • Publicity Chair for the 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2010.
  • Member of the CODATA National Committee since 2016.
  • Member of the Association for Computing Machinery (ACM) since 2010.
  • Member of the Institute of Electrical and Electronics Engineers (IEEE) since 2005.
  • Reviewer and Program Committee member for many international journals and conferences.

