I've been researching the topic of preventing data races using type system for a toy programming language some time ago, and found this paper - Type-Safe Multithreading in Cyclone. It describes and provides a proof of soundness for a safe and very expressive race-free type system.
In particular it allows:
- Expressing that graph of objects is protected by the same lock, without storing a reference to that lock in any of the objects.
- Safe round-tripping reference to an object between multiple queues - references can be passed around freely, but can be dereferenced only on the queue which the object belongs to. Even closures capturing local variables can be round-tripped.
- Generalizing over locks - some or all of the class data can be protected by the lock specified by the client, rather than class itself. And the lock may vary per instance.
- When combined with move-only types can be extended to allow object graphs moving between locks. For example, object graph can be created and modified on a single thread, after that made immutable and shared between several threads.
Paper describes only exclusive locks, but approach can be easily generalised to read-write locks.
In general, a lock can be:
- global singleton
Immutable
lock - always implicitly locked for reading, cannot be locked for writing. Objects protected by this lock effectively become immutable. - current thread/queue - an equivalent of
self
- always implicitly locked for writing in the current function, but not assumed to be locked if captured in a closure. -
NSLock
,pthread_mutex_t
,pthread_rwlock_t
,DispatchQueue
- actors
I wonder if this approach was considered by the core team or anyone else? If there is an interest in this, I can write a proposal.
To give some context, here is description of the approach in my own words using Swift-like syntax. It's not a proposal, more an introduction into the idea. Syntax probably can be made less verbose assuming some things by default, but I'll keep everything explicit to make things more clear.
For starters, I need some syntax to associate variables with locks. For the sake of discussion, let it be a keyword under
- var amount: Int under L
, where L
is a lock name.
Next, lock name L
needs to be declared somehow. Lock names are processed during type-checking, and don't exist in the runtime (well, almost*). In this regard, locks are similar to types, so I'm gonna use generic-like syntax for declaring lock names.
class Account<L: LockName> {
var amount: Int under L
}
We can read variables when corresponding lock is held for reading or writing, and write them when corresponding lock is held for writing. At each point of the program, there is a set of locks known to be held, and each variable access is checked against locks in the that set. Lock set is initialised with the locks mentioned in the function signature. I'll introduce keywords read
and write
for this:
class Account <L: LockName> {
var amount: Int under L
// Caller of this function must ensure that L is held for writing.
// Inside this function we assume that L is held for writing.
func deposit(value: Int) write L {
amount += value // OK
}
func badDeposit(value: Int) {
amount += value // error: mutating variable `amount` requires lock `L` to be held for writing.
}
// `self` and `from` are both guarded by the same lock.
// We don't know what this lock is, but we know it is the same.
func transfer(value: Int, from: Account<L>) write L {
from.amount -= value // OK
amount += value // OK
}
// You can operate on multiple lock names
func looseTransfer<L2: LockName>(value: Int, from: Account<L2>) write L, L2 {
from.amount -= value // OK
amount += value // OK
}
}
But where do lock names come from? Each lock instance have a unique lock name. New lock name is allocated when new lock object is created.
Let's use dispatch queues as locks. But I'll need few changes in the queues API - serial and concurrent need to be different classes. And barrier methods need to be separate methods with different signature.
First, we need to express that each instance of dispatch queue is associated with a lock name.
First attempt - class ConcurrentDispatchQueue<L: LockName>
does not work. Now we need to pass the lock name when creating the queue:
let q = ConcurrentDispatchQueue <???>()
That's what with want with Account
, but not with the ConcurrentDispatchQueue
. Instead we want constructor of the ConcurrentDispatchQueue
to give us the lock name. I'll introduce a keyword new
to express this:
class ConcurrentDispatchQueue <new L: LockName> {
...
}
Now we can write let q = DispatchQueue()
, but what is the type of q
? It cannot be DispatchQueue
, because that's not a complete type, it's an archetype. We get the full type, we need a lock name in the type somewhere. But in this case, lock name is erased. What tool is Swift using for type erasure? Existentials! So q
is of existential type:
let q: any<L: LockName> = ConcurrentDispatchQueue()
To make use of this lock name, we need to open this existential:
let <L1: LockName> q1: ConcurrentDispatchQueue <L> = ConcurrentDispatchQueue()
var a11: Account<L1> = Account<L1>()
var a12: Account<L1> = Account<L1>()
// New queue instance gives us a new lock name
let <L2: LockName> q2: ConcurrentDispatchQueue <L> = ConcurrentDispatchQueue()
var a2: Account<L2> = Account<L2>()
a11 = a12 // OK, same type
a11 = a2 // error, type mismatch
To use something with the accounts, we need to obtain the queue's lock. Methods async()
and barrierAsync()
will help us here:
class ConcurrentDispatchQueue <new L: LockName> {
func async(_ block: @escaping () read L -> ()) { ... }
func barrierAsync(_ block: @escaping () write L -> ()) { ... }
}
...
let <L1: LockName> q1: ConcurrentDispatchQueue <L> = ConcurrentDispatchQueue()
var a11: Account<L1> = Account<L1>()
var a12: Account<L1> = Account<L1>()
q1.async {
a11.tranfer(value: 100, from: a12) // error: method Account.tranfer() requires lock `L1` to be held for writing, but it is held for reading
}
q1.barrierAsync {
a11.tranfer(value: 100, from: a12) // OK
}
*As I mentioned before, normally lock names are processed as symbols during type checking and do not exist in the runtime. But there is a case, where some runtime may be needed:
class Derived<L1: LockName, L2: LockName> : Base<L1> {}
let b: Base<LA> = Derived<LA, LB>()
let c: Derived<LA, LC>? = b as? Derived<LA, LC>
This can be banned, or handled in a special way using explicit existentials over lock names.