Use of @convention(thin) to avoid allocating closures

In SwiftNIO we have a somewhat common pattern of defining state machines that want to manage calling out to user code. Specifically, when an event occurs we want to spin a state machine, and then once we've transitioned the internal state we want to fire a sequence of notifications. These notifications may potentially lead to re-entrant calls to the state machine, and as this state machine is implemented as a struct we need to get the mutating call to the structure off the call stack or we'll have exclusivity violations on the structure.

An example of this code is here.

To avoid the non-exclusive mutation of the structure we currently return a closure from the state transition functions. That leads to functions with a signature like this:

internal mutating func activate(promise: EventLoopPromise<Void>?) -> ((ChannelPipeline) -> Void)

The closure returned here closes over no state internal to the strucutre: all it does is call some functions on its argument. For example:

{ pipeline in
    pipeline.fireChannelRegistered0()
}

The problem is that with this pattern, Swift will by default heap allocate a block for this closure. This means each call to the function will cause a heap allocation and ARC traffic, even though the body of this function is entirely static and could easily just be a C function pointer.

Our current solution to avoiding this allocation is to use @inline(__always), but this is naturally not something we're very happy with. While investigating alternatives today we re-stumbled across @convention(thin). We did some quick investigation and noted that this appears to compile down to an assembly representation that treats the closures as code, and simply returns a function pointer that is then called.

My question is this: is this the expected behaviour of @convention(thin), and is that convention working well today?

6 Likes

One of the points of closures being non escaping by default is that they shouldn't require heap allocations. I think the issue is that no-one working on performance optimizations has gotten around to doing this transformation.

@Erik_Eckstein do you know what the status of this is? This is the sort of thing that should "just work".

5 Likes

Non-escaping closures I agree with, but this closure does escape, as it's the return value of the function. I just want a way to signal to the Swift compiler that the "closure" closes over no state, and can easily be transformed to a block of code in the .text section of the binary and returned as a function pointer.

Are you suggesting the compiler ought to be able to do this automatically? Because that seems unlikely to me: in general the Swift compiler seems to think that the return value of functions that return functions are heap allocated refcounted objects, and it seems intuitive to me that asking for the other behaviour needs to be part of the return type. @convention(thin) would seem to be the perfect candidate.

https://bugs.swift.org/browse/SR-904

another option would obviously be that you can terminate the 'mutable access' (and therefore the exclusivity requirement) somehow. Suboptimal syntax but something like

mutating func foo() {
    /* do some mutating work here */
    withoutActuallyMutating {
        /* no mutations in here; exclusive access terminated before this */
        calloutToUser()
    }
}

@convention(thin) is not officially supported yet and may cause compiler crashes if not used in the narrow circumstances it happens to work by accident today. If a closure truly closes over no state, its context pointer should be null, and retain/release operations should early-exit on it. There'll be some code size and branching overhead from this, but shouldn't be any atomic memory access overhead, at least. If your signatures happen to be C-compatible, you might be able to use @convention(c) instead, which is more completely implemented.

If you write the capture list of a closure explicitly, it will capture the given bindings by value rather than by reference, so there'll be no dynamic exclusive access to the original variable.

1 Like

Sorry, my message was a bit misleading: I thought as an alternative to a closure, if we could terminate the mutating access somehow within the mutating func, then we wouldn't even need to return a closure. The problem is this:

mutating func foo() {
    self.mutateSomeState()
    calloutToUser()  // problem: might call `foo` again which would violate the exclusive access requirement
}

as an alternative we usually do

mutating func foo() -> () -> Void {
    self.mutateSomeState()
    return {
        calloutToUser()
    }
}

and the caller just does foo()() and then foo calls can't be recursive anymore and everything's fine. The problem is that that might allocate (if it's not inlined) and therefore we always write:

@inline(__always)
mutating func foo() -> () -> Void {
    self.mutateSomeState()
    return {
        calloutToUser()
    }
}

which solves the problem. Cory was asking if this would solve the problem too:

mutating func foo() -> @convention(this) () -> Void {
    self.mutateSomeState()
    return {
        calloutToUser()
    }
}

and I thought if there were some way to terminate mutable access, we don't even need to return a closure in the first place and could write:

mutating func foo() {
    self.mutateSomeState()
    let `self` = self /* terminate the mutating access (doesn't work) */
    calloutToUser()  // hypothetically this could not be fine even if the user calls `foo` recursively as we don't have a mutable `self` anymore and therefore technically we don't require the exclusive access restriction. In reality however this doesn't work and you'll still get a crash.
}

Unfortunately there's no way for the code inside a mutating function to end the access, since asserting exclusivity is the caller's responsibility before calling into a mutating function. If invocations of foo from calloutToUser are intended to modify the same value, then you should pass the new value off to calloutToUser as an inout argument to modify directly:

mutating func foo() {
  self.mutateSomeState()
  calloutToUser(&self)
}

which will hand off the exclusive access from the mutating function to calloutToUser.

2 Likes

thanks Joe, that's exactly what we need!

Oh sorry, I misunderstood you. I agree that something like convention(thin) makes sense for that.

What about in more complex callback scenarios? For example, if the structure containing the method foo is embedded in a class, and calloutToUser wants to invoke a method on that class that will attempt to mutate the structure containing foo?

It seems that the pattern of passing &self to the callout is great if the structure is the only object that needs to be mutated, but if the mutation of the struct is intended to be an invisible side-effect of another function call then the pattern no longer seems to work.

Is that right, or have I misunderstood?

For a more complex callback scenario, you might want to lift foo up to be a method on the containing class rather than have it as a mutating method on the struct itself. That way, exclusivity will be dynamically imposed on the value when needed, instead of being asserted from outside the method.

Alternatively, if the state mutation is supposed to be incidental to the semantic value of the struct, then foo shouldn't be mutating at all. You could put the incidental state in a class instance owned by the struct so that it can be modified without formally mutating the struct.

Yeah, I thought about that.

Unfortunately, it leads to what I'd consider sub-optimal code organisation. As a real-world example, consider this code, which is a perfect example of the pattern I'm discussing (in fact, it's the first instance of the problem we hit: the fact that I hit a second is what prompted this post).

This structure is logically related to its parent class. It only makes sense alongside the BaseSocketChannel defined in the file. However, it's extremely helpful from a clarity perspective to move those methods off the class itself and into a helper struct that is only responsible for moving the state machine.

If extensions could declare their own stored properties then I'd probably just rewrite this to be an extension on BaseSocketChannel, but in this case the best I can do is hoist up the vars on that struct into the BaseSocketChannel class and then move all these methods to an extension. It doesn't really lead to ideal code organisation, IMO, which is why we've been looking for an alternative.

That said, it's definitely understandable if you want to say that the way to do it in Swift is just to hoist up the code, and that there's no intention of supporting this "function pointer" view of a closure.

To be clear, I'm not saying anything about our intent regarding fully supporting thin function pointers in the future. To me, returning a callback and expecting the user to invoke it right away also seems like a suboptimal way of sidestepping the borrow model, since the caller could just as easily neglect to invoke the function, or store it away and invoke it later at some arbitrary time, and I'm trying to understand whether there's a better way to model what you're trying to do within the existing language model, or whether the language model is "wrong" and needs to be changed.

1 Like

So as a relevant bit of feedback, we do that in at least one place on purpose, here. Reproducing the code for ease of reading:

// Transition our internal state.
let callouts = self.lifecycleManager.close(promise: p)

if let connectPromise = self.pendingConnect {
    self.pendingConnect = nil
    connectPromise.fail(error: error)
}

// Now that our state is sensible, we can call out to user code.
self.cancelWritesOnClose(error: error)
callouts(self.pipeline)

In this case, the work we have to do divides into stages. The first is making the appropriate state machine transition so that we are in the correct internal state. This must be done first, because a critical failure mode of state-machine based programming is calling out to arbitrary code while the state machine has only partway performed a state transition.

The next bit is handing state that is not related to the general state transition, but is specific to this action. That is, regardless of what state transition we just performed, we need to do this work. This work involves calling out to user code, so it must be done after the state transition, but our contract states that it happens before the other callouts do.

Then, and only then, do we fire the callouts appropriate to the state transition. Exactly what these callouts are depends on what state we were in before we performed the state transition. It's handy that the method that triggers the state transition does not need to know exactly what those callouts are, but can still ensure that it is safe to re-enter before those callouts fire.

I completely agree that in general relying on users to invoke a returned function to make something happen is a dangerous pattern. However, it can be a convenient way of arranging an ordering of events such that a complex, stateful, potentially re-entrant system can gracefully manage its internal state.

If the callout behavior is dependent only on the previous state, perhaps you could return the previous state instead of a closure, and put a method on the state type to perform the callouts. You could maybe factor this pattern into a higher-order method:

func withStateTransition(_ transition: (inout LifecycleManager) -> Void, do body: () -> Void) -> Void {
  let oldState = self.lifecycleManager.state
  transition(&self.lifecycleManager)
  body()
  oldState.callouts(self.pipeline)
}

It's not dependent only on the previous state, but the previous state plus the trigger. You are right though that we could split the operation in half: one part spins the state but does not trigger the callouts, another triggers the callouts appropriate for that state transition. It's a bit annoying becuase a bunch of comparisons have to be repeated (we essentially need to do the state transition switch twice, but one time we throw away the result, which are the callouts), but it would definitely work.