Programming for Performance

CS 610, Semester 2020-2021-I, IIT Kanpur


Instructor Information

Name Swarnendu Biswas
Email swarnendu AT cse.iitk.ac.in
Class hours Wed, Fri 10:35-11:50 AM (online, asynchronous)
Discussion hours Wed 10:30-11:30 AM (online, synchronous)

TA Information

Name Email (AT cse.iitk.ac.in)
Fahad Mohmedisuf Shaikh fahad
Samvid Mistry samvid
Sharwari Samdekar sharwari
Vipin Patel vipinpat

Course Description

To obtain good performance, one needs to write correct and scalable parallel programs using programming language abstractions like threads. In addition, the developer needs to be aware of, and make use of, architecture-specific features like vectorization to extract the full performance potential. In this course, we will discuss programming language abstractions together with architecture-aware development to learn to write scalable parallel programs.

This course will involve programming assignments that apply the concepts learnt in class and help you appreciate the challenges in extracting performance.

Prerequisites
  • Exposure to the following courses (or equivalent) is desirable: CS210 (Computer Organization), CS330 (Operating Systems), and CS422 (Computer Architecture).
  • Programming maturity with popular programming languages like C, C++, and Java.




Course Policies and Syllabus

Policies

Syllabus

The course will primarily focus on the following topics.

We will have several guest lectures during the semester. Dr. Nitya Hariharan, who is a Senior Application Engineer at Intel, will lead discussions on OpenMP. Dr. Sanket Tavarageri, who is a Research Scientist at Intel Labs, will lead discussions on the polyhedral compilation framework.

Feedback

I am open to constructive feedback about the course content and presentation. Feel free to provide suggestions for improvements.


Academic Integrity


Evaluation Scheme


Resources

Date | Topic | Resources | Recommended Reading
02/09 | Course Overview | Slides; First Course Handout | -
02/09 | Compiler Challenges for Parallel Architectures | Slides | AK Chap 1
04/09, 09/09 | Write Cache-Friendly Code | Slides | Cache Miss Analysis Example; CSAPP Chap 6; HP APP B
11/09, 16/09 | Dependence Analysis | Slides | AK Chap 2
16/09, 18/09, 23/09 | Loop Transformations | Slides | AK 5.2-5.4, 5.7.2, 5.9, 6.2.1, 6.2.2, 6.2.5, 6.3.1-6.3.4; AP 4.1, 4.2, 4.5, 5.1-5.6; HP 4.5
- | POSIX Threads | Slides | PP Chapter 4 (IITK has subscribed to the ebook); LLNL Pthreads Tutorial; OSTEP Thread API, Condition Variables
25/09, 30/09, 02/10 | Vectorization | Slides | Guide for Intel Compilers; Cornell Virtual Workshop on Vectorization
07/10, 09/10, 21/10 | OpenMP | OpenMP Basics; OpenMP Advanced | PP Chapter 5 (IITK has subscribed to the ebook); OpenMP Application Programming Interface v5; LLNL OpenMP Tutorial
23/10 | Polyhedral Compilation | Slides | -
28/10, 30/10, 04/11 | Intel TBB | Slides | TBB Chapters 2, 3, 5, 6, 7, 9; TBB Documentation
06/11, 11/11, 13/11, 18/11, 20/11, 25/11, 27/11 | GPU Architecture and CUDA Programming | Slides | KH Chapters 1-6 (IITK has subscribed to the ebook); NVIDIA CUDA C Programming Guide; NVIDIA CUDA C Best Practices Guide


References

I have listed (NOT in any particular order) a few popular references. We may read and discuss related materials and research papers which we will announce in class.