Tech: C, MPI, Slurm
- Analyzed large 3D spatio-temporal datasets for local/global extrema over time.
- Designed 3D Cartesian domain decomposition with halo exchanges via custom MPI datatypes.
- Replaced sequential I/O with MPI collective I/O for scalability; adopted non-blocking communication for speed.
- Achieved up to 7× speedup on C-DAC PARAM Rudra.
Tech: CUDA, C++, atomicCAS, warp-level primitives, Python, Streamlit
- Implemented GPU-resident concurrent stack using atomic compare-and-swap (CAS) and invalidation markers.
- Added warp-level push–pop elimination and custom kernels to reduce contention.
- Built Streamlit analyzer to visualize metrics and validate correctness.
Tech: Python, Pandas, Selenium, Streamlit, Plotly, PyDeck, UMAP, KMeans
- Scraped and processed data for 9,000+ trains across 45,000+ tables.
- Built interactive dashboards for delay hotspots, route analysis, clustering, revenue vs. footfall, and cleanliness.
- Deployed as Streamlit web app and Android APK for commuters and authorities.
Tech: PyTorch, CUDA, ADIOS2 (BP4, SST), Slurm
- Built GPU-accelerated workflow with efficient checkpointing and tensor streaming via ADIOS2.
- Enabled real-time data sharing between compute nodes, coupled with AI training.
Tech: YOLOv7, Flask
- Built real-time gesture detection using YOLOv7 with Flask backend.
- Supports live and image-based American Sign Language (ASL) detection modes.
Tech: Streamlit, Keras, XGBoost, Joblib
- Predicts cricket score after 6 overs using XGBoost, with interactive Streamlit UI.
- Deployed for real-time user interaction.