Units: 3-0-0-0 (9)
The primary objective of the course is to discuss the principles and practices of designing contemporary multi-core and multiprocessor architectures.
This course studies the principles and practices of multi-core and multiprocessor design. It introduces students to broad topics such as cache coherence, memory consistency models, synchronization primitives, on-chip interconnection networks, and performance pathologies of shared-memory parallel programs.
|Sl. No.||Broad Title||Topics||No. of Lectures (75 minutes each)|
|1.||Introduction||Multi-cores: why and what; Moore’s law; Dennard scaling||2|
|2.||Fundamentals of memory system||Virtual memory; address translation hardware; SRAM and caches; DRAM and main memory||6|
|3.||Tools and techniques for evaluating architectures||Simulation; dynamic binary instrumentation; performance counters; use of special instructions such as the x86 cpuid instruction||3|
|4.||Introduction to shared memory multiprocessors and multi-cores||Types of architectures; problem of cache coherence; specification of cache coherence protocols as a set of invariants; basics of memory consistency models||5|
|5.||Shared memory synchronization||Hardware support for efficient synchronization; interplay of cache coherence, speculative execution, and synchronization primitives; implementation of efficient locks and barriers||3|
|6.||Performance analysis of shared memory parallel programs||Brief introduction to shared memory parallel programming techniques: POSIX thread model, OpenMP, fork/mmap; performance pathologies of shared memory parallel programs; influence of cache coherence and synchronization||3|
|7.||Scalable cache coherence||Directory-based coherence protocols and their implementation; case study of SGI Origin 2000 protocol||3|
|8.||Memory consistency models||Sequential consistency, total store order, partial store order, processor consistency, weak ordering, release consistency||2|
|9.||Interconnection networks||Topologies; integrated router design; routing techniques for networks on chip; interplay of deadlock-free routing and cache coherence; virtual channels||2|