Research Page :

Research Interests :

My research interests are in data mining, web mining, machine learning, pattern recognition and applications of rough set theory. I believe these areas are not separate from one another; rather, they are closely inter-related and overlap in principle.

The availability of large databases in domains such as finance, scientific experiments, genomics and the web, together with the growing interest of corporations in discovering the knowledge embedded in these databases, has generated wide interest in data mining. Among the prevailing perspectives on data mining, the machine learning approach is the most popular one. I am interested in applying machine learning, pattern recognition algorithms and statistical techniques to data mining problems, addressing their scalability and their integration with relational databases. I also work on applications of rough set theory to pattern recognition, data mining and bioinformatics problems. Recently, after reading Panayiotis Tsaparas's Ph.D. thesis 'Link Analysis Ranking' (2004), I developed a strong interest in web mining, topic distillation and information retrieval. I am presently working on the 'Semantic Web' for my master's thesis.

I am more interested in developing working systems than in developing algorithms and writing papers. During one of his talks, Dr. Pravin Bhagwat said, "There is no use of writing papers which are read only by the authors because they have written it and the reviewers because they are forced to read!" I agree with his view.

Exploration and contribution are two sides of the coin called 'research'. Having explored enough, I am now ready to contribute to my beloved field!


Past Research Work :

  • "Development of OCR System for Devanagri Script" with Prof. B. B. Choudhuri, IEEE Fellow, visiting faculty from ISI Calcutta (Jan 05 - May 05)
  • "A Rough Association Rule based Approach for Class Prediction with Missing Attribute Values" with Dr. Pabitra Mitra (Oct 04 - May 05)
  • "Object Extraction in Gray-scale Images using Roughness Measure of a Fuzzy Set" with Dr. Pabitra Mitra and Dr. Mohua Banerjee (Jan 05 - Jun 05)
 

Work Currently in Progress :

  • Masters Thesis : "Semantic Web Search based on Topic Maps"  Supervisor : Prof. T. V. Prabhakar
  • "Digital Ecosystem for Agricultural and Rural Livelihood (DEAL)", Media Lab Asia Project with Prof. Jayanta Chatterjee, IME Dept, and Prof. T. V. Prabhakar
  • "Scaling Machine Learning and Data Mining Algorithms for Large Datasets" in collaboration with Deepanjan Kesh, Ph.D. student.

            Among the prevailing perspectives on data mining, the machine learning approach is the most popular one. When applying machine learning, pattern recognition algorithms and statistical techniques to data mining problems, the main issues to be dealt with are scalability and integration with relational databases. In this work, we are trying to address the scalability issue using the concepts of 'Cached Sufficient Statistics', 'Data Summarization' and 'Data Squashing', a neat technique that differs from conventional sampling. We also want to try some randomized approaches to data summarization. The two most popular ones that we have explored so far are sampling and random linear combinations, more popularly known as 'Sketches'.
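
            As a rough illustration of two of the ideas named above, here is a minimal Python sketch. It is not the implementation used in this work; the names SufficientStats, sketch and estimate_squared_norm are hypothetical, and the choice of random +1/-1 coefficients for the linear combinations is an assumption made only for this example. The first part keeps a count, a sum and a sum of squares, which is enough to recover the mean and variance of a column without revisiting the data. The second part summarizes a vector by k random linear combinations of its entries.

```python
import random
from dataclasses import dataclass


@dataclass
class SufficientStats:
    """Count, sum and sum of squares of a numeric column (hypothetical helper)."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, x: float) -> None:
        # One pass over the data: each value updates the three running totals.
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def merge(self, other: "SufficientStats") -> "SufficientStats":
        # Statistics of two data chunks combine by simple addition.
        return SufficientStats(self.n + other.n,
                               self.total + other.total,
                               self.total_sq + other.total_sq)

    def mean(self) -> float:
        return self.total / self.n

    def variance(self) -> float:
        # Population variance, recovered from the cached totals alone.
        m = self.mean()
        return self.total_sq / self.n - m * m


def sketch(vector, k, seed=0):
    """Return k random linear combinations of the vector's entries.

    Coefficients are +1 or -1, drawn from a generator seeded with `seed`,
    so two vectors sketched with the same seed use the same coefficients.
    """
    rng = random.Random(seed)
    return [sum(rng.choice((-1.0, 1.0)) * v for v in vector) for _ in range(k)]


def estimate_squared_norm(sk):
    # Each squared sketch entry has the squared Euclidean norm as its
    # expectation; averaging k of them reduces the variance of the estimate.
    return sum(c * c for c in sk) / len(sk)
```

            Because both summaries are additive (statistics of chunks merge by addition, and the sketch of a sum of vectors equals the sum of their sketches when the seed is shared), they can be maintained incrementally or computed chunk by chunk over data that does not fit in memory.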


 Some Doctoral Theses I have read and am excited about :

  • "Certain Pattern Recognition Tasks for Data Mining", Pabitra Mitra, 2003.
  • "Link Analysis Ranking", Panayiotis Tsaparas, 2004.