explicit-state

Table of Contents

1 Introduction

Declarative Programming: describing "what" to compute, without describing how to compute it. Usually stateless. Imperative Programming: describing "how" to compute. Usually includes state.

Advantages of declarative programming include

  • Easier to build abstractions (procedural abstractions, higher order functions)
  • Easier to test (referential transparency - a function always returns the same values for the same arguments). In programming with state, we have to test sequences of execution.
  • Reasoning is easier (algebraic reasoning is possible).

1.1 State

We will deal with explicit state, in a sequential setting.

Principle of abstraction: Programs can be separated into specification and implementation. Specification: A contract defining how the system should behave Implementation: How the specification is realized.

Declarative and imperative programming both support this. However, when a system "grows", declarative programs have to be changed. The only way for a declarative program to know more about the external world, is by having extra arguments (for example, more accumulators). This means that not only the implementation, but the specification also changes.

Having internal state within the program helps us to model internal state as extra information within the program while keeping the specification the same as before. So imperative program supports this kind of abstraction better.

A system that supports the principle of abstraction has

1.1.1 Encapsulation

It should be possible to hide the internal details of a part.

E.g. A complex number data type can represent its data either in polar coordinate system or in a rectangular coordinate system, if its operations are correct.

Example of a feature that supports encapsulation: Lexical scoping: variables inside a lexical scope are inaccessible by entities outside the scope.

1.1.2 Compositionality

Possible to combine parts to make a new part.

e.g. The programmer should be able to build more complex data structures where their components themselves are user-defined types.

1.1.3 Instantiation/Invocation

Multiple instances of the same definition can be created. These instances may be different based on the environment in which they were instantiated/invoked.

A concrete example: Classes and objects. The objects, when initialized could take arguments based on the point where they are created.

A feature that supports instantiation: higher-order programming.

1.1.4 Invariant

With states, execution can have side effects. This makes reasoning about the correctness of programs, difficult. One way to alleviate this is by using encapsulation. Each encapsulation can be defined to have a property called an invariant, which is always true when viewed from the outside.

1.1.5 Progress

In addition to invariants (what is constant), we need a proof that the system is progressing towards a goal (what changes).

1.2 Component-based Programming

Characterized by encapsulation, compositionality and instantiation.

Each component is a part of the system.

Specifies an interface: how the component is seen from the "outside".

In component-based programming, we can compose components: build a new component that contains the original one.

2 Names

For the discussion, we will need a new concept in Oz called names. Names are constants which are unique in the system. Only two operations are allowed on them.

{NewName} % returns a fresh name
N1==N2    % Compares N1 and N2

The name created by a call to NewName is guaranteed to be unique in the system.

It is not possible, for example, to convert a name to a printable string which is displayed using Browse. The only way to know a name is to be passed it as a reference.

We will use names to define cells, and to secure data in the following discussion.

3 Explicit State

In declarative model, a component's behaviour is dependent only on its arguments. With explicit state, the behaviour depends on the internal state of the component as well.

Advantage of explicit state - can lead to shorter argument lists for components. Advantage of declarative model - we have substitute functions with their values without changing the behaviour of programs (usually called referential transparency)

Definition A state is a sequence of values in time that contains the intermediate values of a computation.

3.1 Implicit State

We can model state, even in the declarative style.

fun {SumList Xs S}
    case Xs
    of nil then S
    [] X|Xr then {SumList Xr X+S}
end

The code above is a declarative code which sums up the elements in a list. Consider the call {SumList [1 2 3] 0}. The last argument to the recursive calls takes the sequence of values 0, 1, 3, 6. This sequence can be seen as a state as per the definition.

3.2 Explicit State

Definition An explicit state in a procedure is a state whose lifetime extends over more than one procedure call, without being one of its argumnents.

Explicit state cannot be expressed in the declarative model. For that, we need a new concept called a cell.

local
   C = {NewCell 0}
   SumList
in
    proc {SumList Xs}
        case Xs
        of nil then skip
        [] X|Xr then
           C:=@C+X
           {SumList Xr}
        end
    end
    {SumList [1 2 3 4]}
    {Browse @C}
end

There are two operations allowed on cells

  • := sets the content of the cell
  • @ accesses the content of the cell

4 Declarative Model with explicit state

Concurrent components like stream objects and ports already introduce state into the declarative model. We will now add state explicitly into the model. The program model is sequential.

Explicit state is a pair of language entities: first, the state's identity and the second is the state's current content.

There is a mapping from the state's identity to its content. This mapping can be modified - this is what introduces explicit state. Interestingly, neither the identity nor the current content can be modified. Modifying the mapping produces the same effect as the modification of the state's content.

4.1 Cells

Definition A cell is a pair of a name, and a reference into the SAS.

Names are unforgeable, hence a cell is a secure ADT.

We introduce a new store called the mutable store. The cells are part of the mutable store.

There are now two stores: the SAS, which is immutable, and the mutable store of the cells.

.....................     .......................
.                   .     .                     .
.                   .     .                     .
.    w:10           .     .    cell1: w         .
.    x:2            .     .    cell2: x         .
.                   .     .                     .
.                   .     .                     .
.                   .     .                     .
.                   .     .                     .
.....................     .......................
   IMMUTABLE STORE        MUTABLE STORE consisting of cells

We add two new statements to the declarative sequential model of Chapter 2, to handle cells.

4.2 NewCell

The statement that creates a cell is

{NewCell Cell InitialValue}

Semantics

In addition to the single assignment store (and the trigger store if we consider lazy evaluation), we create a mutable store. This store is initially empty. The mutable store contains cells which are pairs of the form x:y where x and y are variables of the SAS. x is always a name. y can be any partial value.

The execution state is now a tripe (Stack, SAS, Mutable).

The semantics for the ({NewCell Cell InitialValue}, Env) statement is:

  • Create a new name n in SAS
  • Bind E(Cell) and n in SAS
  • If binding succeeds, add the pair E(Cell):E(InitialValue) in the Mutable store
  • Otherwise, raise an error.

4.3 Exchange

The statement that modifies the value of a cell is:

{Exchange Cell OldValue NewValue}

(OR)

OldValue=(Cell:=NewValue)

This statement atomically binds OldValue to the old content of Cell, and sets the new content of Cell to be NewValue. Another syntax for the exchange statement is using the := operator. Usually we do not care about the old value of the cell, so the syntax simply reads

Cell:=NewValue

Semantics

The semantics of ({Exchange Cell OldValue NewValue},E) is

+If the activation condition, E(Cell) being determined, is false, suspend execution. Otherwise, we do the following.

  1. If E(Cell) is not bound to the name of a cell, then raise an error
  2. If the mutable store contains E(Cell):w, then do the following
    • Update the mutable store to E(Cell):E(NewValue)
    • Bind E(OldValue) and w in the store.

4.4 Memory Management

We just have to modify the definition of reachability to deal with the new operations. In addition to the previous cases, a variable y is reachable if the mutable store contains x:y and x is reachable.

If a variable x becomes unreachable, and the mutable store contains x:y, we remove this pair from the MS. (Note that y is not necessarily reclaimed in SAS.)

Reversing a list with cells:

fun {Reverse Xs}
   R = {NewCell nil}
in
   for X in Xs do Rs := X|@Rs end
   @Rs
end

4.5 Sharing and Equality

A cell is semantically, a pair of the form x:y. Hence we now have to distinguish the equality of cells from the equality of their contents.

The concept of sharing or aliasing is when two identifiers refer to the same cell.

X = {NewCell 0}
Y = X            % Y is an alias of X
Y := 10
{Browse @X}      % displays 10

When any alias changes the content of the cell, all the aliases see the changed content. This makes reasoning about the program difficult (In the previous example, if you just consider the statements in which X appears, you cannot compute the current value of X.)

4.5.1 Token and Structure Equality

Two values are equal iff they have the same structure. For example, the variables in the following are equal.

X = rec(f:1)
Y = rec(f:2)
{Browse X==Y}    % displays true

This is structure equality. With cells, we also introduce the concept of token equality - two cells are equal iff they are the same cell.

X = {NewCell 0}
Y = {NewCell 0}
{Browse X==Y}    % displays false
{Browse @X==@Y}  % displays true. However, this may change subsequently
Y:=1
{Browse @X==@Y}  % displays false now

Z = X
{Browse X==Z}    % displays true

*

5 Data Abstraction

A data abstraction is a way to define the abstract behaviour of data independent of its specific implementation. This leads to better modularity in the overall program, making it easier to reason about and to make changes in speicific parts of the program without affecting other parts.

The book talks about three axes of designing data structures.

  • Open vs. Secure
  • Unbundled (ADT) vs. Bundled (e.g. Classes)
  • Declarative vs. using Explicit State.

All choices among these are possible, leading to eight different ways of abstracting data. We will consider a stack implementation, and consider some ways of doing this.

5.1 Open Unbundled Declarative stack

Suppose we implement a stack as a list, which supports operations for pushing, popping, and to check emptiness. We might implement it in the following manner.

local
  fun {NewStack} nil end
  fun {Push E S} E|S end
  fun {Pop S} if {isEmpty S}==false then S.1 end
  fun {IsEmpty S} S==nil end
in
  local X Y in
    X = {NewStack}
    {Browse {Pop {Push 2 {Push 1 X}}}} % displays 2
end

This implementation, where there is no encapsulation of the data and the procedures acting on it, is called an unbundled implementation or an Abstract Data Type (ADT).

The disadvantage is that if a programmer guesses that the stack is implemented as a list, (s)he can call {Length S}, which is not an operation that is meant to be performed on a stack.

5.2 Secure Unbundled Declarative stack

How can we ensure that a user goes only through the procedures that are provided in the ADT? We can use names for this purpose.

We will first introduce the notion of a wrapper/unwrapper pair of functions. To secure some value X, what we will do is to "lock" X inside a function, with a name - we call this they "Key". This function expects a key as input. If the given key is correct, we get the value of X.

Unwrap is a function which can unlock the wrapped version of X by giving the correct key.

Two things together ensure security: NewName returns a unique name, and the Key thus obtained is hidden from the outside by lexical scoping.

declare

proc {NewWrapper ?Wrap ?Unwrap}
    Key={NewName} % Unique key.
in
    % returns a function, which serves as
    % the "wrapped version" of X
    fun {Wrap X}
       fun {$ K} 
         if K==Key then X end 
       end
    end

    % function which takes a wrapped input and
    % returns the unwrapped value
    fun {Unwrap W}
       {W Key} % Calls wrapped version with the correct key
    end 
end

% Usage
S = [1 2 3]             
SecureS = {Wrap S}    % Note: SecureS is a function!
T = {Unwrap SecureS}  % T = [1 2 3]
                      % {Unwrap {Wrap S}}==S
% {Length SecureS} will fail

Using the wrap/unwrap pair, we can now implement a secure unbundled stack.

local Wrap Unwrap in
    {NewWrapper Wrap Unwrap}

    % Note that the wrapped stack is a function, not a list.

    fun {NewStack} {Wrap nil} end 
    
    fun {Push E S} {Wrap E|{Unwrap S}} end

    % Pop binds the top of the stack in E and returns the smaller
    % stack. 
    fun {Pop S ?E}
        case {Unwrap S}
        of H|T then 
            E=H
            {Wrap S1}
        end
    end

    fun {IsEmpty S} {Unwrap S}==nil end
end

Now, the only way a user can mainpulate the stack is through the set of four functions above. A call of the form {Length {NewStack}} will fail.

5.3 Secure Bundled Stateful stack

The above two implementations were "unbundled" implementations. We can implement declarative bundled implementations as well. We will change two design choices at the same time and consider a secure, bundled, stateful stack. We will see that we do not need names here: security is ensured just by lexical scoping.

fun {NewStack}
    C={NewCell nil}
    proc {Push E} C:=E|@C end
    fun {Pop} case @C of X|S1 then C:=X1 S end end
    fun {IsEmpty} @C==nil end
in
    stack(push:Push pop:Pop isEmpty:IsEmpty}
end

6 Polymorphism

An procedure is polymorphic if it can operate on arguments of different types. There are two main kinds

  • Generic : The same procedure body can operate on different argument types.
  • Ad hoc : There are different procedure bodies for different kinds of arguments. This is also called overloading in Object-oriented languages.

Though polymorphism can be supported in any paradigm, it is easier to define for object-oriented languages.

We give two examples of polymorphic code, one with the secure unbundled implementation of a collection, and the other with an open bundled declaration.

TBD