## Rethinking Support for Region Conflict Exceptions

Swarnendu Biswas, Rui Zhang, Michael D. Bond, and Brandon Lucia

**IPDPS 2019** 

#### C++ Program with Data Race

X\* x = NULL;
bool done = false;

**Thread T1**

x = new X();
done = true;

**Thread T2**

while (!done) {}
x->func();

#### Catch-Fire Semantics in C++

C++ treats data races as errors
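For contrast with the racy program above, here is a minimal sketch of a data-race-free variant. It assumes the flag can be made a `std::atomic<bool>` with release/acquire ordering; the body of `X::func` and the helper `run_once` are hypothetical and exist only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the slide's class X.
struct X {
    int func() const { return 42; }
};

X* x = nullptr;
std::atomic<bool> done{false};  // atomic flag replaces the plain bool

// Runs T1 and T2 once and returns what T2 observed from x->func().
int run_once() {
    x = nullptr;
    done.store(false);
    int result = 0;

    std::thread t1([] {
        x = new X();
        done.store(true, std::memory_order_release);  // publishes x
    });
    std::thread t2([&result] {
        while (!done.load(std::memory_order_acquire)) {}  // waits for T1
        result = x->func();  // ordered after T1's write to x
    });

    t1.join();
    t2.join();
    delete x;
    x = nullptr;
    return result;
}
```

Because the release store and the acquire load order T1's write to `x` before T2's read, `run_once()` returns 42 on every run. With the plain `bool` of the original program there is no such ordering, so the behavior is undefined.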




- "Killed by a Machine: The Therac-25", an account of the Therac-25 radiation therapy machine
- Sarita Adve, "Technical Perspective: Data Races are Evil with No Exceptions" (research highlights)
- Hans-J. Boehm, "How to miscompile programs with "benign" data races", HP Laboratories

## Need for Stronger Semantics for Programs with Data Races

Adve and Boehm, CACM'10

"The inability to define reasonable semantics for programs with data races is not just a theoretical shortcoming, but a fundamental hole in the foundation of our languages and systems."

"We call upon software and hardware communities to develop languages and systems that enforce data-race-freedom, ..."

### What Do We Mean by Strong Semantics?

End-to-end guarantees even for programs with data races

## Outline

- Impact of Data Races on Language Models
- Strong Semantics with Region Conflict Exceptions
- **Providing Region Conflict Exceptions**
- ARC: Practical Architecture Support for Region Conflict Exceptions
- Comparison of ARC with Related Approaches

## Strong Execution Semantics with Region Conflict Exceptions


#### Data Race Exceptions



#### **Region Conflicts**

[Figure: region conflict timeline; axis: time]

## Semantics with Region Conflict Exceptions





## Providing Region Conflict Exceptions

#### Valor: Efficient, Software-Only Region Conflict Exceptions\*

#### Conflict Exceptions: Simplifying Concurrent Language Semantics with Precise Hardware Exceptions for Data-Races

- Biswas et al. Valor: Efficient, Software-Only Region Conflict Exceptions. OOPSLA 2015.
- Lucia et al. Conflict Exceptions: Simplifying Concurrent Language Semantics With Precise Hardware Exceptions for Data-Races. ISCA 2010.

## Drawbacks with Conflict Exceptions

#### Builds on top of M(O)ESI-style cache coherence

- Introduces hardware on top of existing structures
- Increases complexity

#### Inter-core communication at region boundaries

- Metadata in private cache lines is forwarded to other cores
- Increases on-chip interconnect bandwidth requirement

#### Private line evictions communicate with memory

- Relies on in-memory backup for evicted metadata
- Increases off-chip memory bandwidth requirement

Lucia et al. Conflict Exceptions: Simplifying Concurrent Language Semantics With Precise Hardware Exceptions for Data-Races. ISCA 2010.

## ARC: Practical Architecture Support for Region Conflict Exceptions

- **Design Overview**
- Architectural Modifications
- Example Executions with ARC


### Baseline Architecture in ARC




## Release Consistency

A core's private cache waits to write back its dirty data until a synchronization release operation.

Releasing core:

X = new Object();
done = true;
unlock(m);

Acquiring core (later in time):

lock(m);
while (!done) {}
X.compute();

#### Self-Invalidation

A core invalidates private cache lines that may be **out-of-date** at synchronization acquire operations.
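The interplay of the two mechanisms can be sketched with a small software model. This is only an illustration under simplifying assumptions: one integer per address, a private cache with no capacity limit, and self-invalidation of every clean line at an acquire. The names `SharedLlc`, `Core`, and `demo_stale_then_fresh` are hypothetical and do not come from the talk.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Addr = std::uint64_t;

// Shared last-level cache: the only place where cores see each other's data.
struct SharedLlc {
    std::unordered_map<Addr, int> data;
};

class Core {
public:
    explicit Core(SharedLlc& llc) : llc_(llc) {}

    int read(Addr a) {
        auto it = cache_.find(a);
        if (it != cache_.end()) return it->second.value;  // hit, possibly stale
        int v = llc_.data[a];                             // miss: fetch from LLC
        cache_[a] = Line{v, false};
        return v;
    }

    void write(Addr a, int v) { cache_[a] = Line{v, true}; }  // stays private

    // Release consistency: dirty data reaches the LLC only at a release.
    void release() {
        for (auto& kv : cache_) {
            if (kv.second.dirty) {
                llc_.data[kv.first] = kv.second.value;
                kv.second.dirty = false;
            }
        }
    }

    // Self-invalidation: drop clean lines, which may be out of date.
    void acquire() {
        for (auto it = cache_.begin(); it != cache_.end();) {
            if (!it->second.dirty) it = cache_.erase(it);
            else ++it;
        }
    }

private:
    struct Line { int value; bool dirty; };
    SharedLlc& llc_;
    std::unordered_map<Addr, Line> cache_;
};

// Returns true if the consumer sees the new value only after its acquire.
bool demo_stale_then_fresh() {
    SharedLlc llc;
    Core producer(llc), consumer(llc);
    const Addr kDone = 0x40;

    int before = consumer.read(kDone);   // 0, now cached privately
    producer.write(kDone, 1);
    producer.release();                  // write-back to the LLC
    int stale = consumer.read(kDone);    // still 0: private copy is stale
    consumer.acquire();                  // self-invalidate
    int fresh = consumer.read(kDone);    // 1: refetched from the LLC
    return before == 0 && stale == 0 && fresh == 1;
}
```

In this model no invalidation message ever travels between cores: the producer pushes data at its release, and the consumer discards possibly stale lines at its acquire.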

# **ARC**: Our Proposed Technique for Region Conflict Detection

Explore whether synergistic use of release consistency and self-invalidation can be competitive

Provide consistency and coherence at SFR boundaries and on private cache line evictions



[Figure: execution timelines; axis: time]

## Serializability of Regions

A region appears serializable if:

- There were no conflicts
- Writes appear atomic
- Values read are consistent

## Region Boundary Operations in ARC

A region appears serializable if:

- There were no conflicts
- Writes appear atomic
- Values read are consistent

At a region boundary, an ARC core executes:

- **Pre-commit**: Write back dirty lines to the LLC
- **Read validation**: Validate reads using version and value validation
- **Post-commit**: Clear per-core metadata, self-invalidate private lines

These operations ensure consistency and provide coherence.
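The three region-boundary steps can be sketched as a simplified software model. It is not ARC's hardware design: it uses version checks only (no value validation), detects only the case where a line read in the region was overwritten by another core, and skips validation for lines the core wrote itself. `Llc`, `RegionCore`, and the demo functions are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Addr = std::uint64_t;

struct LlcLine {
    int value = 0;
    std::uint32_t version = 0;  // bumped on every write-back
};

struct Llc {
    std::unordered_map<Addr, LlcLine> lines;
};

class RegionCore {
public:
    explicit RegionCore(Llc& llc) : llc_(llc) {}

    int read(Addr a) {
        auto w = writes_.find(a);
        if (w != writes_.end()) return w->second;       // own buffered write
        auto r = reads_.find(a);
        if (r != reads_.end()) return r->second.value;  // private copy
        const LlcLine& line = llc_.lines[a];
        reads_[a] = ReadEntry{line.value, line.version};
        return line.value;
    }

    void write(Addr a, int v) { writes_[a] = v; }

    // Returns true if the region ends without a detected conflict.
    bool end_region() {
        // Pre-commit: write back dirty lines to the LLC.
        for (const auto& kv : writes_) {
            LlcLine& line = llc_.lines[kv.first];
            line.value = kv.second;
            ++line.version;
        }
        // Read validation: every line read must still have the version seen.
        bool ok = true;
        for (const auto& kv : reads_) {
            if (writes_.count(kv.first)) continue;  // simplification: own write
            if (llc_.lines[kv.first].version != kv.second.version) ok = false;
        }
        // Post-commit: clear per-core metadata and self-invalidate.
        reads_.clear();
        writes_.clear();
        return ok;
    }

private:
    struct ReadEntry { int value; std::uint32_t version; };
    Llc& llc_;
    std::unordered_map<Addr, ReadEntry> reads_;
    std::unordered_map<Addr, int> writes_;
};

// Core a reads a line that core b overwrites before a's region ends.
bool demo_conflict() {
    Llc llc;
    RegionCore a(llc), b(llc);
    a.read(0x10);
    b.write(0x10, 7);
    bool b_ok = b.end_region();
    bool a_ok = a.end_region();  // version changed under a's read
    return b_ok && !a_ok;
}

// The two cores touch different lines, so both regions commit.
bool demo_no_conflict() {
    Llc llc;
    RegionCore a(llc), b(llc);
    a.read(0x10);
    b.write(0x20, 7);
    return b.end_region() && a.end_region();
}
```

In the first demo the conflict surfaces lazily, at the reader's region boundary, rather than at the moment of the conflicting write.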

### ARC: Practical Architecture Support for Region Conflict Exceptions

- Design Overview
- **Architectural Modifications**
- Example Executions with ARC

### Detecting Sound and Precise Conflicts



Private cache line

### Modifications Introduced by ARC



### Metadata Management



### Access Information Memory (AIM)

#### AIM is a dedicated metadata cache adjacent to the LLC

#### AIM lines in ARC can be large

- 100 bytes for 8 cores, 178 bytes for 16 cores, and 308 bytes for 32 cores
- Impractical to have large AIM cache structures

ARC assumes a realistic AIM design with 32K entries
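To get a feel for these numbers, the following back-of-the-envelope sketch multiplies the listed line sizes by the entry count. It rests on two assumptions that the slide does not state: that "32K" means 32 × 1024 entries, and that every entry stores one full AIM line. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: "32K entries" means 32 * 1024 entries.
constexpr std::uint64_t kAimEntries = 32 * 1024;

// AIM line size in bytes for the core counts listed on the slide; 0 otherwise.
constexpr std::uint64_t aim_line_bytes(int cores) {
    return cores == 8 ? 100 : cores == 16 ? 178 : cores == 32 ? 308 : 0;
}

// Assumption: every entry stores one full AIM line.
constexpr std::uint64_t aim_capacity_bytes(int cores) {
    return kAimEntries * aim_line_bytes(cores);
}
```

Under these assumptions the metadata storage comes to 3,276,800 bytes (3.125 MiB) for 8 cores, 5,832,704 bytes (5.5625 MiB) for 16 cores, and 10,092,544 bytes (9.625 MiB) for 32 cores. The actual per-entry storage in ARC may differ.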



### ARC: Practical Architecture Support for Region Conflict Exceptions

- Design Overview
- Architectural Modifications
- **Example Executions with ARC**

# Comparison of ARC with Related Approaches

Implementation and Evaluation

### Comparing Conflict Exceptions and ARC

#### **Conflict Exceptions**

- Builds on M(O)ESI
  - Coherence at granularity of memory accesses
- Requires support for a directory and point-to-point communication
- Detects conflicts eagerly

#### ARC

- Adapts release consistency and self-invalidation schemes
  - Coherence at region granularity
- Requires an AIM cache, write signatures, and consistency controllers
- Uses a mix of eager and lazy conflict detection

Lucia et al. Conflict Exceptions: Simplifying Concurrent Language Semantics With Precise Hardware Exceptions for Data-Races. ISCA 2010.

### Implementation and Evaluation

- Simulation
  - A Pintool generates a stream of memory and synchronization events
  - Events are processed by model implementations of Conflict Exceptions (CE) and ARC
  - Use McPAT to estimate energy usage

### Run-time Performance

normalized to CE-4




### Energy Usage

normalized to CE-4



## Overhead of Providing Region Conflict Detection

 Current shared-memory systems provide undefined semantics for racy programs

Overhead comparison at 32 cores:

| Approach | Run-time performance overhead (%) | Energy usage overhead (%) |
|----------|-----------------------------------|---------------------------|
| CE       | 26.7                              | 41.4                      |
| ARC      | 12.5                              | 27.8                      |
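Reading both rows as overheads in percent (an interpretation of the table header), the gap between CE and ARC at 32 cores works out as:

$$
\begin{aligned}
\Delta_{\text{run-time}} &= 26.7 - 12.5 = 14.2 \ \text{percentage points},\\
\Delta_{\text{energy}} &= 41.4 - 27.8 = 13.6 \ \text{percentage points}.
\end{aligned}
$$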

### Key Takeaways!

Release consistency and self-invalidation techniques can be a good fit for detecting region conflicts

A small metadata cache provides reasonable tradeoffs between performance and complexity

Compared to the state of the art, ARC shows promise in making region conflict detection practical
