

# CS698Y: Modern Memory Systems Lecture-1 (Introduction)

### Biswabandan Panda

biswap@cse.iitk.ac.in





### **Course Staff**

Instructor: Biswa (Biswabandan, Sir, Prof., Dr., Er., \*-Biswa)

Website: www.cse.iitk.ac.in/biswap

Contact: KD 203, biswap@cse.iitk.ac.in

Office Hours: Mon/Thurs: 12 noon, by appointment

Teaching and Research Interests: Computer Architecture and Systems

### What, When, and Where of CS698Y

When: Mon/Thurs 10.30-12 Hrs, Where: KD 103, What: You know it

Course website: www.cse.iitk.ac.in/~biswap/CS698Y.html

Piazza: For online discussions

**Submission of Assignments: Canvas** 

Register ASAP (Wait till you see the next few slides)

# Let's Get Started









# What is the takeaway?



#### Read this?







### **Prerequisite**

**Instruction pipelining** 

LOAD/STORE, PC

Cache, L1/L2

Tag/Index/Offset

**Direct/Associative mapping** 

**SRAM/DRAM** 

Latency/Throughput

Virtual/Physical address

**Process/Thread** 

**Programming in C/C++** 
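As a quick self-check on the Tag/Index/Offset item above, here is a minimal C sketch of how a cache splits an address into those three fields. The geometry (64-byte blocks, 64 sets) and the names `addr_fields` and `split_address` are assumptions chosen only for illustration; real caches differ.

```c
#include <stdint.h>

/* Hypothetical cache geometry, chosen only for illustration:
 * 64-byte blocks and 64 sets (e.g. a 32 KiB, 8-way cache). */
enum { OFFSET_BITS = 6, INDEX_BITS = 6 };

typedef struct {
    uint64_t tag;     /* identifies which block occupies the set */
    uint64_t index;   /* selects the cache set                   */
    uint64_t offset;  /* selects the byte within the block       */
} addr_fields;

/* Split an address into tag, index, and offset fields. */
addr_fields split_address(uint64_t addr)
{
    addr_fields f;
    f.offset = addr & ((1ULL << OFFSET_BITS) - 1);
    f.index  = (addr >> OFFSET_BITS) & ((1ULL << INDEX_BITS) - 1);
    f.tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    return f;
}
```

Under these assumed sizes, `split_address(0x12345)` yields tag 0x12, index 0xD, and offset 0x5.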

**Score yourself** 

10 - Expert, 5 - Knowledgeable, 0 - No knowledge

#### Your score

### What is Expected From You?

No open screens (no nomophobia): keep smartphones, laptops, and tablets closed, and keep your phone on silent mode.

Open screens distract you, your friends, and me.

Ask questions & participate in in-class discussions (worth 10 points ☺)

Paper reading and writing reviews/reports

Understand, implement, and analyze ideas

Slides will not contain everything. So attend lectures.

### **Memory System**



### Why Memory Systems (CS698Y)?



**Latency - 100s of cycles** 

**Energy and Power** 

Fixed bandwidth - GB/sec

**Reliability and Security** 

**Capacity - GBs** 

Fairness and Quality of Service

### **LATENCY**

**Performance** 
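One standard back-of-the-envelope way to see why a miss latency of 100s of cycles hurts performance is the average memory access time, AMAT = hit time + miss rate × miss penalty. The sketch below is a single-level approximation; the function name `amat` and the numbers in the usage note are assumptions for illustration, not measurements.

```c
/* Average memory access time for a single cache level, in cycles:
 * AMAT = hit_time + miss_rate * miss_penalty. */
double amat(double hit_time, double miss_rate, double miss_penalty)
{
    return hit_time + miss_rate * miss_penalty;
}
```

With an assumed 4-cycle hit, a 2% miss rate, and a 200-cycle miss penalty, `amat(4.0, 0.02, 200.0)` gives 8 cycles: the rare misses cost as much as all the hits combined.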

### **Interference @Shared Resources**



#### **Trend on Core Count**





Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.

### **Multicores**







**Heterogeneous cores (iPhone 7)** 

### Multi to Many

# IBM's brain-like chip: 4096 cores



Source: IBM

# 1000-core chip from UC Davis



Source: UC Davis

#### **Datacenter @Facebook**



Source: The Verge

# Ideally ??



Source: Erlang@SinaWeibo

### **Interference??**



Source: Erlang@SinaWeibo

The number of cores is doubling every two years. More interference?

### **CAPACITY & BANDWIDTH**

**Data-intensive Applications** 

# **Trend on Memory Capacity & Bandwidth**



Source: Lim et al., ISCA 2009

### **ENERGY & POWER**

**Energy bills & heating** 

# RELIABILITY

**Loss of data** 

### **Row-Hammer Problem**

```
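loop:
  mov (X), %eax    # read address X (activates X's DRAM row)
  mov (Y), %ebx    # read address Y (a different row in the same bank)
  clflush (X)      # evict X from the cache so the next read goes to DRAM
  clflush (Y)      # evict Y from the cache as well
  mfence           # wait for the flushes to complete
  jmp loop         # repeat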
```

Source: Kim et al., ISCA 2014
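To see why the loop above needs the `clflush` instructions, the following toy model counts DRAM row activations with and without them. It is a sketch under loud assumptions: one bank with a single row buffer, a cache that can hold both lines, and X and Y in different rows of the same bank. The names `toy_mem` and `hammer_activations` are hypothetical, and nothing here touches real hardware.

```c
#include <stdbool.h>

/* Toy model (an assumption for illustration, not real hardware):
 * one DRAM bank with a single row buffer, and a cache that can hold
 * the lines for X and Y. X is row 0 and Y is row 1 of the same bank. */
typedef struct {
    bool cached[2];      /* is X (0) / Y (1) currently in the cache? */
    int  open_row;       /* row held in the row buffer, -1 if none   */
    long activations;    /* number of row activations so far         */
} toy_mem;

static void toy_load(toy_mem *m, int which)
{
    if (m->cached[which])
        return;                    /* cache hit: DRAM is not touched */
    if (m->open_row != which) {    /* row-buffer miss: open the row  */
        m->open_row = which;
        m->activations++;
    }
    m->cached[which] = true;       /* fill the cache line */
}

static void toy_flush(toy_mem *m, int which)
{
    m->cached[which] = false;      /* models clflush */
}

/* Run the loop for `iters` iterations and return the activation count. */
long hammer_activations(long iters, bool use_clflush)
{
    toy_mem m = { { false, false }, -1, 0 };
    for (long i = 0; i < iters; i++) {
        toy_load(&m, 0);           /* mov (X), %eax */
        toy_load(&m, 1);           /* mov (Y), %ebx */
        if (use_clflush) {
            toy_flush(&m, 0);      /* clflush (X) */
            toy_flush(&m, 1);      /* clflush (Y) */
        }
    }
    return m.activations;
}
```

In this model, 1000 iterations with flushing cause 2000 row activations, while the same loop without flushing causes only 2, because every later access hits in the cache. Repeated activations of neighboring rows are what disturb the victim row.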

### **Row-Hammer Problem**



| Model                      | # Devices | # Vulnerable |
|----------------------------|-----------|--------------|
| **ARMv7 (32-bit) devices** |           |              |
| LG Nexus 4                 | 1         | 1            |
| LG Nexus 5                 | 15        | 12           |
| Motorola Moto G (2013)     | 1         | 1            |
| Motorola Moto G (2014)     | 1         | 1            |
| OnePlus One                | 2         | 2            |
| Samsung Galaxy S4          | 1         | 1            |
| Samsung Galaxy S5          | 2         | 1            |
| **ARMv8 (64-bit) devices** |           |              |
| HTC Desire 510             | 1         | 0            |
| Lenovo K3 Note             | 1         | 0            |
| LG G4                      | 1         | 1            |
| LG Nexus 5X                | 1         | 0            |
| Samsung Galaxy S6          | 1         | 0            |
| Xiaomi Mi 4i               | 1         | 0            |

Source: fossbytes.com

# **SECURITY**

**Information leakage** 

# **Side/Covert-channel Attack**



Source: Pinterest

# **FAIRNESS & QOS**

Fair Slowdown

**Minimum Performance Guarantee** 
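One common way to quantify fairness in a shared memory system is through per-application slowdown (time when sharing divided by time when running alone) and the ratio of the largest to the smallest slowdown. The sketch below assumes that definition; the function names and the cycle counts in the usage note are hypothetical.

```c
#include <stddef.h>

/* Slowdown of one application: how much longer it runs when sharing
 * the memory system than when running alone (1.0 means no slowdown). */
double slowdown(double cycles_shared, double cycles_alone)
{
    return cycles_shared / cycles_alone;
}

/* Unfairness as max slowdown / min slowdown across n applications
 * (1.0 means every application is slowed down equally).
 * Assumes n >= 1 and all cycle counts are positive. */
double unfairness(const double *cycles_shared, const double *cycles_alone, size_t n)
{
    double lo = slowdown(cycles_shared[0], cycles_alone[0]);
    double hi = lo;
    for (size_t i = 1; i < n; i++) {
        double s = slowdown(cycles_shared[i], cycles_alone[i]);
        if (s < lo) lo = s;
        if (s > hi) hi = s;
    }
    return hi / lo;
}
```

For two applications that each take 100 cycles alone but 200 and 300 cycles when sharing, the slowdowns are 2 and 3, so the unfairness is 1.5. A minimum performance guarantee would instead bound each application's slowdown individually.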

# **Ideal Memory Systems**

Latency - Low

Bandwidth - High

Capacity - High

**Energy and Power - Low** 

Reliability and Security - High (not at the cost of the previous four)

Fairness and Quality of Service - High

Cost ??

## CS698Y is not a

**Theory course** 

Digital/analog (☺) circuit-level course

Course on HDL/VHDL modeling of memory systems

**Course on microprocessors** 

**Caches** 



**DRAM** 



Memory stacking



NPU

**VERTICAL STACKING (3D)** 

**INTERPOSER STACKING (2.5D)** 

## **Processing in Memory**



## Secure and reliable **Cache and DRAM**







# Takeaways (learning objectives) from CS698Y

**Understand and appreciate** memory systems

Analyze and evaluate memory-system performance bottlenecks

**Research** on memory systems

# **Learning Points**

## **Option-I:**

30 = (3 × 10) = 3 programming assignments

40 = (2 × 20) = Quiz 1.0 and Quiz 2.0 (optional Quiz 1.1 and Quiz 2.1) = max (Quiz 1.x) + max (Quiz 2.x)

20 = (2 × 10) = 2 paper reviews

10 = Classroom and Piazza participation

Bonus points for finding typos in slides

## **Option-II:**

30 = (3 × 10) = 3 programming assignments

20 = (1 × 20) = max (Quiz 1.0, Quiz 1.1)

30 = (1 × 30) = 1 research project (weekly meetings)

10 = (2 × 5) = 2 paper reviews

10 = Classroom and Piazza participation

Iterative Assessment

Choose your option wisely ☺

## **NEXT TWO LECTURES**

## **BASICS OF PROCESSOR & CACHE**

**ASSIGNMENT-0** 

Submit it by tonight (11:59 PM)

"It takes two to speak the truth - one to speak and another to hear" - Henry David Thoreau

# Thank You & Have a Good Day