BitMat

BitMat is an ongoing project, originally developed as part of my Ph.D. thesis. The proposed algorithms
and system index an RDF graph using compressed bit-vectors and
process SPARQL Basic Graph Pattern queries using a novel two-phase query processing algorithm.
The algorithm gives tighter upper bounds on memory consumption than conventional join query processors.
See the source code of the project.
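To give a feel for bit-vector indexing of RDF triples, here is a minimal Python sketch. It is not the BitMat implementation: the class `TinyBitIndex`, the function `star_join`, and the example terms are hypothetical, plain Python integers stand in for compressed bit-vectors, and only a simple star-shaped join on a shared subject variable is handled. It shows just the general idea that candidate bindings can be narrowed with bitwise AND before any results are materialized.

```python
from collections import defaultdict


class TinyBitIndex:
    """Toy bit-vector index over RDF triples (illustrative only)."""

    def __init__(self, triples):
        self.ids = {}    # subject term -> integer ID
        self.terms = []  # integer ID -> subject term
        # subjects_of[predicate][object] = bitmask of subject IDs
        self.subjects_of = defaultdict(lambda: defaultdict(int))
        for s, p, o in triples:
            self.subjects_of[p][o] |= 1 << self._id(s)

    def _id(self, term):
        # Assign the next free integer ID to an unseen subject.
        if term not in self.ids:
            self.ids[term] = len(self.terms)
            self.terms.append(term)
        return self.ids[term]

    def subjects(self, p, o):
        """Bitmask of subjects s such that (s, p, o) is in the graph."""
        return self.subjects_of.get(p, {}).get(o, 0)

    def decode(self, mask):
        """Turn a bitmask back into the set of subject terms."""
        return {self.terms[i] for i in range(mask.bit_length()) if (mask >> i) & 1}


def star_join(index, patterns):
    """Answer { ?x p1 o1 . ?x p2 o2 . ... } by AND-ing subject bitmasks."""
    if not patterns:
        return set()
    mask = index.subjects(*patterns[0])
    for p, o in patterns[1:]:
        mask &= index.subjects(p, o)  # prune candidates without enumerating them
    return index.decode(mask)


triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:worksAt", "ex:Lab"),
    ("ex:carol", "ex:worksAt", "ex:Lab"),
]
index = TinyBitIndex(triples)
result = star_join(index, [("rdf:type", "ex:Person"), ("ex:worksAt", "ex:Lab")])
```

In this toy version the join touches only one integer per triple pattern, so the working memory is bounded by the size of the bitmasks rather than by the number of intermediate join results.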
Relevant publications:
- Medha Atre: BitMat-mcore: SPARQL Query Processing with Multi-core BitMat, work in progress.
- Medha Atre: Algorithms and Analysis for the SPARQL Constructs, under submission (arXiv).
- Gurkirat Singh*, Dhawal Upadhyay*, Medha Atre: Efficient RDF Dictionaries with B+ trees, CoDS-COMAD 2018 (PDF).
- Medha Atre: For the DISTINCT clause of SPARQL queries, WWW Posters Track, 2016 (PDF).
- Medha Atre: Left Bit Right: For SPARQL Join Queries with OPTIONAL Patterns (Left-outer-joins), SIGMOD 2015 (PDF).
- Medha Atre, Vineet Chaoji, Mohammed J. Zaki, James A. Hendler: Matrix "Bit"loaded: A Scalable Lightweight Join Query Processor for RDF Data, WWW 2010 (PDF) (Presentation) (BitMat source code).
- Gregory Williams, Jesse Weaver, Medha Atre, James A. Hendler: Scalable Reduction of Large Datasets to Interesting Subsets, Journal of Web Semantics (Special Issue: Science, Services and Agents on the World Wide Web), 2010 (winner of the 2009 Billion Triple Challenge, ISWC, October 2009) (Paper).
- Medha Atre, James A. Hendler: BitMat: A Main Memory Bit-matrix of RDF Triples, in SSWS workshop at ISWC 2009 (PDF).
- Medha Atre, Jagannathan Srinivasan (Oracle), James A. Hendler: BitMat: A Main-memory Bit Matrix of RDF Triples for Conjunctive Triple Pattern Queries, ISWC Poster and Demo track, October 2008 (PDF) (first runner-up among 85 posters/demos).
Real time coordinated video surveillance

This is an ongoing project with my co-PI, Prof. Tanaya Guha, on
anomaly detection in real-time streaming surveillance video
without relying purely on deep-learning-based computer vision techniques.
Large Scale Cross-Modal Media Retrieval

This is an ongoing project with my co-PI, Prof. Tanaya Guha, that brings together
large-scale media data management and indexing with machine learning techniques
for cross-modal media retrieval, which has important applications in fields such as
education, medicine, and emotion detection.
Graph Path Queries

With the advent of the web, graphs have become richer: edge labels now represent
the type of relationship between the two nodes connected by an edge. Exploring paths in
graphs is a well-studied problem in the context of data like XML, but for general-purpose
graphs it is a hard problem due to the exponential number of possible paths.
E.g., the RDF graph of DBLP data, with 13 million edges and 5 million nodes, has more than 10^25
distinct paths.
In our work, we have focused on a different type of path pattern and constrained reachability queries.
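As a rough illustration of why a constrained reachability query need not enumerate paths, here is a small Python sketch. It assumes one possible reading of a label-order constraint (the path's edge labels must follow a given sequence, each label used one or more consecutive times); this definition, the function `constrained_reachable`, and the example graph are assumptions for illustration and are not taken from the BitPath paper. The search runs over (node, position in label sequence) states, so its cost is bounded by the number of nodes times the number of labels rather than by the number of paths.

```python
from collections import deque


def constrained_reachable(edges, source, target, label_order):
    """True if some path source -> target uses the labels in the given order,
    each label one or more consecutive times (assumed constraint)."""
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))
    if not label_order:
        return source == target

    last = len(label_order) - 1
    # State = (node, index of the last label used); -1 means no edge taken yet.
    start = (source, -1)
    seen = {start}
    queue = deque([start])
    while queue:
        node, pos = queue.popleft()
        if node == target and pos == last:
            return True
        for label, nxt in adj.get(node, []):
            # Either repeat the current label or advance to the next one.
            for new_pos in (pos, pos + 1):
                if 0 <= new_pos <= last and label_order[new_pos] == label:
                    state = (nxt, new_pos)
                    if state not in seen:
                        seen.add(state)
                        queue.append(state)
    return False


edges = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "worksAt", "d"),
    ("a", "worksAt", "d"),
]
forward = constrained_reachable(edges, "a", "d", ["knows", "worksAt"])
backward = constrained_reachable(edges, "a", "d", ["worksAt", "knows"])
```

Here `forward` is found via a -> b -> c -> d, while `backward` fails because no `knows` edge follows a `worksAt` edge on any path from `a` to `d`.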
Relevant publications:
- Work in progress on path pattern query optimization using a BitMat-like indexing technique.
- Medha Atre, Vineet Chaoji, Mohammed J. Zaki: BitPath -- Label Order Constrained Reachability Queries over Large Graphs, CoRR, March 13, 2012 (PDF).