Preventing data races using dependent types

Some time ago I was researching how to prevent data races using the type system for a toy programming language, and found this paper - Type-Safe Multithreading in Cyclone. It describes, and provides a proof of soundness for, a safe and very expressive race-free type system.

In particular it allows:

  • Expressing that a graph of objects is protected by the same lock, without storing a reference to that lock in any of the objects (see the sketch after this list).
  • Safely round-tripping a reference to an object between multiple queues - references can be passed around freely, but can be dereferenced only on the queue that the object belongs to. Even closures capturing local variables can be round-tripped.
  • Generalizing over locks - some or all of the class data can be protected by a lock specified by the client, rather than by the class itself. And the lock may vary per instance.
  • When combined with move-only types, the approach can be extended to allow object graphs to move between locks. For example, an object graph can be created and modified on a single thread, then made immutable and shared between several threads.
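
To make the first bullet concrete, here is a minimal sketch using the Swift-like under/LockName syntax introduced later in this post (Node and bumpAll are made-up names): an entire linked structure is guarded by a single lock name, and none of the nodes stores a reference to the lock itself.

class Node<L: LockName> {
    // Both fields are protected by whatever lock the client associates with L.
    var value: Int under L
    var next: Node<L>? under L
}

// All nodes reachable from `head` share the same lock name, so holding that
// one lock for writing is enough to mutate the whole structure.
func bumpAll<L: LockName>(head: Node<L>?) write L {
    var node = head
    while let n = node {
        n.value += 1 // OK: L is held for writing
        node = n.next
    }
}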

The paper describes only exclusive locks, but the approach can easily be generalised to read-write locks.

In general, a lock can be:

  • the global singleton Immutable lock - always implicitly locked for reading, never locked for writing. Objects protected by this lock effectively become immutable.
  • current thread/queue - an equivalent of self - always implicitly locked for writing in the current function, but not assumed to be locked if captured in a closure.
  • NSLock, pthread_mutex_t, pthread_rwlock_t, DispatchQueue
  • actors

I wonder if this approach has been considered by the core team or anyone else? If there is interest in this, I can write a proposal.


To give some context, here is a description of the approach in my own words using Swift-like syntax. It's not a proposal, more of an introduction to the idea. The syntax could probably be made less verbose by assuming some things by default, but I'll keep everything explicit to make things clearer.

For starters, I need some syntax to associate variables with locks. For the sake of discussion, let it be a keyword under - var amount: Int under L, where L is a lock name.

Next, the lock name L needs to be declared somehow. Lock names are processed during type checking, and don't exist at runtime (well, almost*). In this regard, locks are similar to types, so I'm going to use generic-like syntax for declaring lock names.

class Account<L: LockName> {
    var amount: Int under L
}

We can read variables when the corresponding lock is held for reading or writing, and write them when the corresponding lock is held for writing. At each point of the program there is a set of locks known to be held, and each variable access is checked against the locks in that set. The lock set is initialised with the locks mentioned in the function signature. I'll introduce the keywords read and write for this:

class Account<L: LockName> {
    var amount: Int under L

    // Caller of this function must ensure that L is held for writing.
    // Inside this function we assume that L is held for writing.
    func deposit(value: Int) write L {
        amount += value // OK
    }

    func badDeposit(value: Int) {
        amount += value // error: mutating variable `amount` requires lock `L` to be held for writing.
    }

    // `self` and `from` are both guarded by the same lock.
    // We don't know what this lock is, but we know it is the same.
    func transfer(value: Int, from: Account<L>) write L {
        from.amount -= value // OK
        amount += value // OK
    }

    // You can operate on multiple lock names
    func looseTransfer<L2: LockName>(value: Int, from: Account<L2>) write L, L2 {
        from.amount -= value // OK
        amount += value // OK
    }
}
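
The examples above only exercise write; reads work symmetrically. A sketch of the read side, in the same hypothetical syntax (balance() and badWithdraw() are made-up methods on Account):

class Account<L: LockName> {
    // ... as above ...

    // Reading `amount` only requires L to be held for reading (a write lock also suffices).
    func balance() read L -> Int {
        return amount // OK
    }

    func badWithdraw(value: Int) read L {
        amount -= value // error: mutating variable `amount` requires lock `L` to be held for writing.
    }
}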

But where do lock names come from? Each lock instance has a unique lock name. A new lock name is allocated when a new lock object is created.

Let's use dispatch queues as locks. But I'll need a few changes in the queue API - serial and concurrent queues need to be different classes, and barrier methods need to be separate methods with a different signature.

First, we need to express that each instance of dispatch queue is associated with a lock name.

A first attempt - class ConcurrentDispatchQueue<L: LockName> - does not work, because now we need to pass the lock name when creating the queue:

let q = ConcurrentDispatchQueue<???>()

That's what we want for Account, but not for ConcurrentDispatchQueue. Instead, we want the constructor of ConcurrentDispatchQueue to give us the lock name. I'll introduce a keyword new to express this:

class ConcurrentDispatchQueue<new L: LockName> {
    ...
}

Now we can write let q = ConcurrentDispatchQueue(), but what is the type of q? It cannot be ConcurrentDispatchQueue, because that's not a complete type, it's an archetype. To get the full type, we need a lock name in the type somewhere. But in this case the lock name is erased. What tool does Swift use for type erasure? Existentials! So q is of existential type:

let q: any<L: LockName> ConcurrentDispatchQueue<L> = ConcurrentDispatchQueue()

To make use of this lock name, we need to open this existential:

let <L1: LockName> q1: ConcurrentDispatchQueue<L1> = ConcurrentDispatchQueue()
var a11: Account<L1> = Account<L1>()
var a12: Account<L1> = Account<L1>()
// A new queue instance gives us a new lock name
let <L2: LockName> q2: ConcurrentDispatchQueue<L2> = ConcurrentDispatchQueue()
var a2: Account<L2> = Account<L2>()

a11 = a12 // OK, same type
a11 = a2 // error, type mismatch

To do something with the accounts, we need to obtain the queue's lock. The methods async() and barrierAsync() will help us here:

class ConcurrentDispatchQueue<new L: LockName> {
    func async(_ block: @escaping () read L -> ()) { ... }
    func barrierAsync(_ block: @escaping () write L -> ()) { ... }
}
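
As mentioned above, serial queues would be a separate class. Since a serial queue never runs two blocks concurrently, every block it runs can assume exclusive access. A sketch (SerialDispatchQueue is a made-up class, mirroring the one above):

class SerialDispatchQueue<new L: LockName> {
    // Every block gets exclusive access, so it may assume L is held for writing.
    func async(_ block: @escaping () write L -> ()) { ... }
    func sync(_ block: () write L -> ()) { ... }
}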

...

let <L1: LockName> q1: ConcurrentDispatchQueue<L1> = ConcurrentDispatchQueue()
var a11: Account<L1> = Account<L1>()
var a12: Account<L1> = Account<L1>()
q1.async {
    a11.transfer(value: 100, from: a12) // error: method Account.transfer() requires lock `L1` to be held for writing, but it is held for reading
}
q1.barrierAsync {
    a11.transfer(value: 100, from: a12) // OK
}
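
This also demonstrates the round-tripping from the list at the top of the post: a reference can travel through other queues freely, as long as it is only dereferenced under its own queue's lock. A sketch using the same hypothetical API:

let <L1: LockName> q1: ConcurrentDispatchQueue<L1> = ConcurrentDispatchQueue()
let <L2: LockName> q2: ConcurrentDispatchQueue<L2> = ConcurrentDispatchQueue()
let account = Account<L1>()

q2.async {
    // `account` can be captured and passed around here...
    // account.deposit(value: 100) // ...but not dereferenced: requires `L1` to be held for writing
    q1.barrierAsync {
        account.deposit(value: 100) // OK: back under q1's lock, `L1` is held for writing
    }
}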

*As I mentioned before, lock names are normally processed as symbols during type checking and do not exist at runtime. But there is a case where some runtime support may be needed:

class Derived<L1: LockName, L2: LockName> : Base<L1> {}

let b: Base<LA> = Derived<LA, LB>()
let c: Derived<LA, LC>? = b as? Derived<LA, LC>

This can be banned, or handled in a special way using explicit existentials over lock names.
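
One possible shape for the explicit-existential option (hypothetical syntax, just to illustrate the idea): a cast may only recover lock names that stay existentially erased, never concrete ones, so no runtime lock-name metadata is required.

let b: Base<LA> = Derived<LA, LB>()
let d = b as? any<L: LockName> Derived<LA, L> // OK: the second lock name stays erased
// let c = b as? Derived<LA, LC>              // banned: would require comparing LB and LC at runtime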

5 Likes

There’s more to this system than this explanation covers. Our generics system would require variance over locks with respect to other locks. For example, a generic function composition operation that is correctly locked is:

func compose<A: UniqueA, B: UniqueB, C: UniqueC>(_ f: (B) locks(B) -> C, _ g: (A) locks(A) -> B) -> (A) locks(A; B) -> C
  where A <= B, B <= C
{
  ...
}

Where “locks(A)” is their shorthand for “require the signature of f to take all of the locks A takes”, and the locking order has to be written out by hand. And this signature doesn’t even implement the most gnarly form - the callee locked one - which now requires bounds on locks(...) as well!

This gets really hairy really quickly. It’s not just dependent types here - we’d probably need to build higher-rank polymorphism, another effect system, variance, kind subtyping, global lock names for top-level statics, sealing and unsealing effects...

5 Likes

I mean, don’t get me wrong, this is really neat research. But from a practical standpoint, this is a whole lotta typing rules for surprisingly little payoff. They can’t guarantee progress, they can’t guarantee deadlock freedom (not that you should in the strictest sense but you made me write out a locking order by hand!), and they note this system will misbehave if dynamic downcasts aren’t backed by runtime checks.

1 Like

Not sure I'm following you. Where did the lock ordering come from? It is not part of the paper I've linked.
Lock ordering may be needed for catching deadlocks in the type system, but let's consider that off topic and focus on data races. For preventing data races, lock ordering is not needed.

About kind subtyping: I've re-read the paper and found out that I had memorised some things incorrectly, and my explanation of handling thread-local data does not match the paper. The paper treats the current-thread lock name (loc, probably for local) as a global constant and provides an actual dummy lock object for the current thread. But that leads to a problem: once you have a dummy object corresponding to the current thread, you can pass it to another thread and try to lock it from there. To mitigate this, the paper introduces "sharable" and "unsharable" kinds, which leads to kind subtyping and breaks round-tripping.

What I was describing is a different scheme:

  • There is no lock object corresponding to the current thread/queue, only the lock name. Function inc2() from Figure 1 cannot be called with thread-local data.
  • The lock name for the current thread/queue is not a constant, but an implicit parameter to each function. Closures and inner functions introduce a new parameter that shadows the parameter from the outer scope.

The following code would produce an error without additional kinds:

func f() {
    var x: Int = 0
    let q = ConcurrentDispatchQueue()
    q.async {
        x += 1 // error: x can be mutated only on the thread that called `f`.
    }
}

To understand how, let's make the current-thread lock names explicit:

func f<SelfThread: LockName>() write SelfThread {
    var x: Int under SelfThread = 0
    typealias OuterThread = SelfThread
    let q = ConcurrentDispatchQueue()
    q.async { <SelfThread: LockName> () write SelfThread in
        x += 1 // error: mutating x requires `OuterThread` to be held for writing, but `SelfThread` is a different lock here
    }
}

This approach allows round-tripping and does not require additional kinds - only the kinds for Type and LockName are needed.
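
To illustrate the round-trip this enables, here is a sketch in the same hypothetical syntax (store is a made-up callback that ships the closure to some other queue and eventually hands it back to the original thread, where calling it becomes legal again): a closure capturing thread-local state can be passed anywhere, but can only be called where the SelfThread it requires is held.

func g<SelfThread: LockName, L: LockName>(
    q: ConcurrentDispatchQueue<L>,
    store: @escaping (() write SelfThread -> ()) -> ()  // made-up callback that keeps the closure somewhere
) write SelfThread {
    var x: Int under SelfThread = 0
    let bump: () write SelfThread -> () = { x += 1 }  // captures thread-local state
    bump()          // OK: `SelfThread` is held for writing here
    q.async {
        store(bump) // OK: the closure can be passed around freely
        // bump()   // error: calling it requires `SelfThread` of g, which is not held on q
    }
}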

1 Like