[Pitch] `defer` statement that runs only on error

Joe_Groff · December 22, 2021, 1:15am

The defer statement is a useful tool for guaranteeing the cleanup of resources when a scope is exited. Sometimes, it may be desirable to set up a scoped cleanup that runs if the scope is interrupted by an error, but not if the scope is exited successfully. For instance, you may be trying to perform a multiple-step transition between two valid states, where any of the multiple steps may fail, and want to make sure you don't get stuck in an invalid intermediate step. Right now, the facilities for expressing this are wanting. You can use a flag:

// Valid states are:
//    stateA = 0; stateB = 0; stateC = 0
//  -or-
//    stateA = 6; stateB = 7; stateC = 9
var stateA = 0
var stateB = 0
var stateC = 0
func transitionStates() throws {
  var didFinish = false

  stateA = try computeSix()
  defer { if !didFinish { stateA = 0 } }

  stateB = try computeSeven()
  defer { if !didFinish { stateB = 0 } }

  stateC = try computeNine()
  defer { if !didFinish { stateC = 0 } }

  try finishTransition()
  didFinish = true
}

but this leaves a lot in the hands of programmer discipline: as the code evolves, the programmer must remember to check if !didFinish in every cleanup, and must remember to set the flag on every path where the transition is ready to be committed. An arguably more disciplined way to manage this is with nested catch blocks:

func transitionStates() throws {
  stateA = try computeSix()
  do {
    stateB = try computeSeven()
    do {
      stateC = try computeNine()
      do {
        try finishTransition()
      } catch {
        stateC = 0
        throw error
      }
    } catch {
      stateB = 0
      throw error
    }
  } catch {
    stateA = 0
    throw error
  }
}

This pattern reduces the room for programmer error (though it doesn't eliminate it completely—the programmer still has to remember to re-throw the error after every catch block), but it's painful to look at, being a "pyramid of doom" with confusingly-nested reverse-ordered catch blocks, the same sorts of readability issues that led us to adopt Go-style defer instead of Java-style finally blocks in the original error handling model.

It would be nice if there was a form of defer that runs only when the scope is exited because of a thrown error. I went ahead and prototyped this idea, using the strawman syntax defer catch (a syntax I only used because it was easy to implement, and am in no way committed to). This allows the above example to be written straightforwardly:

func transitionStates() throws {
  stateA = try computeSix()
  defer catch { stateA = 0 }

  stateB = try computeSeven()
  defer catch { stateB = 0 }

  stateC = try computeNine()
  defer catch { stateC = 0 }

  try finishTransition()
}

These sorts of multi-step state transitions are still best avoided if at all possible, of course. For a simple value-type-based example like this, you could combine stateA, stateB, and stateC above into a single struct, have transitionStates operate on a local copy of the struct while making incremental changes, and then assign the final value back once the transition is complete. However, when working in constrained environments, or with existing APIs that require parts of a transaction to be done in multiple possibly-throwing calls, it may not always be possible to avoid exposing intermediate states. Having a defer-on-error construct would fill in a useful expressivity gap for these situations.
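As a rough sketch of the struct-based alternative described above (not code from the pitch): the `States` and `Machine` types and the `failLastStep` parameter are hypothetical names made up for illustration, and the literal `6` and `7` stand in for `computeSix()` and `computeSeven()`.

```swift
struct StepFailed: Error {}

// All three pieces of state live in one value type.
struct States: Equatable {
    var a = 0, b = 0, c = 0
}

struct Machine {
    var states = States()

    // Hypothetical stand-in for computeNine() that can be made to fail.
    func computeNine(failing: Bool) throws -> Int {
        if failing { throw StepFailed() }
        return 9
    }

    mutating func transitionStates(failLastStep: Bool) throws {
        // Work on a local copy; `states` is untouched if any step throws.
        var next = states
        next.a = 6
        next.b = 7
        next.c = try computeNine(failing: failLastStep)
        // Commit only after every step has succeeded.
        states = next
    }
}

var machine = Machine()
try? machine.transitionStates(failLastStep: true)
// machine.states is still States(a: 0, b: 0, c: 0)
try? machine.transitionStates(failLastStep: false)
// machine.states is now States(a: 6, b: 7, c: 9)
```

Because the intermediate values only ever exist in `next`, no cleanup is needed on the error path at all, which is why this shape is preferable when the API allows it.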

ksluder · December 22, 2021, 1:25am

In a real function, I expect it will be quite common to reach a point where running a previously deferred statement in response to an error would be incorrect, and potentially even compound the error.

Do you have an idea what percentage of complex error-handling functions would be able to safely use defer catch statements as they are proposed here?

bnyu · December 22, 2021, 1:28am

There is a related proposed defer if
defer if catch {}
defer if true {}

Zhu_Shengqi · December 22, 2021, 1:48am

Joe_Groff:

func transitionStates() throws {
  stateA = try computeSix()
  defer catch { stateA = 0 }

  stateB = try computeSeven()
  defer catch { stateB = 0 }

  stateC = try computeNine()
  defer catch { stateC = 0 }

  try finishTransition()
}

Can we just write like this below:

func transitionStates() throws {
  stateA = 0
  stateA = try computeSix()

  stateB = 0
  stateB = try computeSeven()

  stateC = 0
  stateC = try computeNine()

  try finishTransition()
}

The only problem I see is that we need to do additional assignments to the state properties. Maybe this is the issue the proposal wants to address?

ktoso · December 22, 2021, 2:01am

This is quite nice

We actually had a similar thing implemented in distributed actors way back when... so I'm definitely in support of this, very nice to see it make a comeback in the language proper (even though I guess we came up with it independently).

It definitely is quite useful to defer only the error path. The one we had back then didn't offer the error into the block; I wonder if this should though...? (Maybe it's just the catch making me think it would be nice to).

I'm assuming the usual interleaving of defer and defer catch would just work as expected etc.

Very nice, would like to see this polished up and pitched

wowbagger · December 22, 2021, 3:52am

Zhu_Shengqi:

Can we just write like this below:

func transitionStates() throws {
  stateA = 0
  stateA = try computeSix()

  stateB = 0
  stateB = try computeSeven()

  stateC = 0
  stateC = try computeNine()

  try finishTransition()
}

Writing it like this could potentially result in stateA = 6; stateB = 7; stateC = 0, which is an invalid state in the example.
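A small sketch of how that invalid state can arise. The `Machine` type is a hypothetical wrapper, the literals `6` and `7` stand in for `computeSix()` and `computeSeven()`, and `computeNine()` is assumed to always throw here:

```swift
struct StepFailed: Error {}

struct Machine {
    var stateA = 0, stateB = 0, stateC = 0

    // Hypothetical stand-in for computeNine() that always fails.
    func computeNine() throws -> Int { throw StepFailed() }

    mutating func transitionStates() throws {
        stateA = 0
        stateA = 6  // stands in for try computeSix()
        stateB = 0
        stateB = 7  // stands in for try computeSeven()
        stateC = 0
        stateC = try computeNine()  // throws; A and B are never rolled back
    }
}

var machine = Machine()
try? machine.transitionStates()
// machine is left at stateA = 6, stateB = 7, stateC = 0
```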

1-877-547-7272 · December 22, 2021, 6:42am

I like this idea!

I'd also like to see a version of defer that could be used within failable initializers. The syntax could be something like defer nil {...} and it would only run if return nil is used. This could help with cleaning up resources in failable initializers since deinit isn't always called when you return nil.

A tangent about failable/throwing initializers and `deinit`

I think the current behavior of failable and throwing initializers is too confusing. Right now, deinit will be called if all stored properties are initialized, and deinit won't be called if only some stored properties are initialized. This causes some odd behavior — consider the following code:

class Foo {
    let pointer: UnsafeMutableRawPointer
    
    init?() {
        pointer = .allocate(byteCount: 1, alignment: 1)
        return nil
    }
    
    deinit {
        pointer.deallocate()
    }
}

class FooWithInteger {
    let pointer: UnsafeMutableRawPointer
    let integer: Int
    
    init?() {
        pointer = .allocate(byteCount: 1, alignment: 1)
        return nil
    }
    
    deinit {
        pointer.deallocate()
    }
}

If you call Foo(), then all memory is properly cleaned up. If you call FooWithInteger(), however, then you get a memory leak since the allocated memory isn't deallocated.

I think that it would be much easier to reason about code like this if deinit were only called if the object was actually initialized. If you use return nil or exit with throw, then deinit will not be called and you'll have to deal with cleanup explicitly.
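One way to do that explicit cleanup today, sketched under the assumption that the resource can be held in a local until the initializer can no longer fail. The `Buffer` class and its `failValidation` parameter are hypothetical, for illustration only:

```swift
final class Buffer {
    let pointer: UnsafeMutableRawPointer
    let byteCount: Int

    init?(byteCount: Int, failValidation: Bool) {
        // Allocate into a local first, so no stored property is set yet.
        let raw = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: 1)
        if failValidation {
            // Clean up by hand on the failure path instead of relying on deinit.
            raw.deallocate()
            return nil
        }
        pointer = raw
        self.byteCount = byteCount
    }

    deinit {
        // Only reached for objects that were fully initialized.
        pointer.deallocate()
    }
}
```

With this shape the failure path never depends on how many stored properties happen to be initialized at the point of `return nil`.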

kaan · December 22, 2021, 8:35am

Joe_Groff:

func transitionStates() throws {
  stateA = try computeSix()
  defer catch { stateA = 0 }

  stateB = try computeSeven()
  defer catch { stateB = 0 }

  stateC = try computeNine()
  defer catch { stateC = 0 }

  try finishTransition()
}

Just as a quick question, is there any specific reason why we have 3 separate defer catch statements in the example snippet? Couldn't this simply be written as the following?

func transitionStates() throws {
  defer catch {
    stateA = 0
    stateB = 0
    stateC = 0
  }

  stateA = try computeSix()
  stateB = try computeSeven()
  stateC = try computeNine()
  try finishTransition()
}

I'd also like to see defer catch becoming more aligned with the regular catch, as @ktoso alluded to, so that the error is either implicitly passed in, or can be matched against, i.e.

defer catch {
  // Have access to error here implicitly
}

defer catch ComputationError.timedOut {
  // Only run if .timedOut is thrown 
}

tclementdev · December 22, 2021, 10:34am

Joe_Groff:

func transitionStates() throws {
  stateA = try computeSix()
  defer catch { stateA = 0 }

  stateB = try computeSeven()
  defer catch { stateB = 0 }

  stateC = try computeNine()
  defer catch { stateC = 0 }

  try finishTransition()
}

I'm confused by this. Does only the defer ~~following~~ preceding the throwing line run? Or the defer ~~following~~ preceding the throwing line + all the defers before it?

sveinhal · December 22, 2021, 11:00am

All blocks that have been deferred by the time the function returns are executed, in reverse order.
This is the same as the current behavior. This only adds the catch keyword to conditionally execute the deferred block of code.
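To illustrate the existing behavior with plain defer (no new syntax involved), here is a minimal sketch. The `runSteps` function, its `failEarly` parameter, and the log strings are made up for the example:

```swift
struct StepFailed: Error {}

func runSteps(failEarly: Bool, log: inout [String]) throws {
    defer { log.append("undo 1") }
    // Deferred blocks below this point are not registered if we throw here.
    if failEarly { throw StepFailed() }
    defer { log.append("undo 2") }
    defer { log.append("undo 3") }
    throw StepFailed()
}

var early: [String] = []
try? runSteps(failEarly: true, log: &early)
// early is ["undo 1"]

var late: [String] = []
try? runSteps(failEarly: false, log: &late)
// late is ["undo 3", "undo 2", "undo 1"]
```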

tclementdev · December 22, 2021, 11:03am

Thanks! I misread the do/catch code and thought not all preceding catch blocks executed. The pyramid of doom is hard to read indeed.

sveinhal · December 22, 2021, 11:04am

kaan:

I'd also like to see defer catch becoming more aligned with the regular catch, as @ktoso alluded to, so that the error is either implicitly passed in, or can be matched against. i.e
defer catch {
  // Have access to error here implicitly
}

defer catch ComputationError.timedOut {
  // Only run if .timedOut is thrown 
}

I'd like for the defer blocks to be able to access the return value as well.

defer {
   // have access to return value here
}

Maybe

defer let result {
   // only executed on non-error paths?
   // result is return value
}

Idk. ¯\_(ツ)_/¯

Joe_Groff · December 22, 2021, 6:41pm

That's an interesting observation. You could express this sort of thing by using do scopes to enclose different transaction endpoints:

do {
  try doStepA()
  defer catch { try undoStepA() }
  try doStepB()
  defer catch { try undoStepB() }
  try commitStepsAandB()
}

// We can continue with step C without undoing A or B
do {
  try doStepC()
}

though people writing and maintaining the code still need to be aware enough to do that, and ensure the scope ends are placed correctly.

ksluder · December 22, 2021, 6:57pm

Would this preclude typed throws, since the catch might be deferred past several disjointly-typed try statements?

jrose · December 22, 2021, 8:58pm

This gets tricky since defer is executed when you leave the current scope, not the current function (unlike Go). The compiler can enforce that you only use the feature in scopes that always end with returning (or throwing), but I think it’s enough complexity that it would deserve its own pitch rather than getting rolled up into this one.
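A minimal sketch of that scoping rule using today's defer. The `scopedDefer` function and its log strings are hypothetical, chosen just to show the ordering:

```swift
func scopedDefer() -> [String] {
    var log: [String] = []
    do {
        // Tied to the enclosing `do` scope, not to the function.
        defer { log.append("defer in do") }
        log.append("inside do")
    }
    // The deferred block has already run by this point,
    // well before the function returns.
    log.append("after do")
    return log
}

// scopedDefer() returns ["inside do", "defer in do", "after do"]
```

Since the deferred block can run in a scope that ends without returning, there is not always a return value for it to observe, which is the complication described above.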

michelf · December 22, 2021, 9:34pm

I don't like "defer catch": it does not really catch the error, it simply suspends its propagation while executing some code. You don't have to throw the error again to continue propagation, which is very unlike a catch block.

I would suggest this instead:

defer failure { ... }

And perhaps also:

defer success { ... }

when exiting the scope without a failure.

xwu · December 22, 2021, 9:46pm

This is a very good point!

Instead of coming up with a different syntax, though, what if defer catch did have such behavior, and users could choose to rethrow the error or not on their way out of the scope?

michelf · December 22, 2021, 10:10pm

I think something like this could work in theory:

defer catch { print(error); throw error }

but it brings to front some questions I'd rather avoid people asking themselves:

1. can you throw a different error?
2. can you not throw an error?

I think point 2 is very problematic:

Swallowing the error becomes the default behavior. If you forget to write a throw in there it'll dramatically change the workflow.
Does it just jump to the end of the scope as if nothing was thrown? This sounds confusing, especially with nested scopes.
Swallowing the error might mean you need to return a value for the function. We'd need to allow return inside of defer catch and require it (or not) depending on how the surrounding scope is expected to exit.
Throwing and returning is not possible in a regular defer block, it'd only be possible in defer catch, which is inconsistent.

My opinion is that defer catch would make some sense if the compiler could make sure an error is always thrown at the end of the block (similar to how guard-else needs to exit the scope). But in 99% of cases the ability to throw a different error is not needed, so better relieve the syntax from having to express this. You can still use do-catch if you need to swap the error for another, and since there's no reverse order in how things happen in do-catch what code sees what error would be less confusing.

Saklad5 · December 22, 2021, 10:22pm

I’m a little concerned about the effects this would have on readability. Not enough to oppose it outright, but enough to be concerned.

On a related note, could we generalize this feature a bit? For instance, it might be nice to have a “catch and release” construct that automatically propagates errors, allowing conventional try-catch to easily fill a similar role.
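Something in the spirit of "catch and release" can be approximated in today's Swift with a helper function. This is only a sketch: `catchAndRelease`, `failingStep`, and `demo` are hypothetical names, not an existing API.

```swift
struct StepFailed: Error {}

// Hypothetical helper: runs `body`; if it throws, runs `onError`
// and then propagates the original error unchanged.
func catchAndRelease<T>(
    _ body: () throws -> T,
    onError: (Error) -> Void
) rethrows -> T {
    do {
        return try body()
    } catch {
        onError(error)
        throw error
    }
}

// Hypothetical step that always fails.
func failingStep() throws -> Int { throw StepFailed() }

func demo() -> (state: Int, propagated: Bool) {
    var state = 6
    var propagated = false
    do {
        _ = try catchAndRelease({ try failingStep() }, onError: { _ in state = 0 })
    } catch {
        // The original error still reaches the caller.
        propagated = true
    }
    return (state, propagated)
}

// demo() returns (state: 0, propagated: true)
```

Unlike the pitched statement, this wraps the throwing call in a closure, so it nests rather than reading top to bottom.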

It may also be nice to make this novel defer statement conditional on specific errors. Even if we don’t have typed throwing yet, that’d help achieve feature parity with try-catch. It could also make the syntax a bit more elegant:

defer { print("Runs when scope is exited") }
defer until Error { print("Runs when error is thrown") }
defer until SpecificError { print("Runs when error that pattern-matches SpecificError is thrown") }

mpangburn · December 22, 2021, 11:13pm

Joe_Groff:

func transitionStates() throws {
  stateA = try computeSix()
  defer catch { stateA = 0 }

  stateB = try computeSeven()
  defer catch { stateB = 0 }

  stateC = try computeNine()
  defer catch { stateC = 0 }

  try finishTransition()
}

I haven't often encountered code that requires this granularity of state resetting. While recognizing that this is a toy example, I'm working to understand where this improves upon e.g.

func transitionStates() throws {
  do {
    stateA = try computeSix()
    stateB = try computeSeven()
    stateC = try computeNine()
    try finishTransition()
  } catch {
    (stateA, stateB, stateC) = (0, 0, 0)
    throw error
  }
}

My impressions are:

1. If the real-world replacement for stateA = 0 (i.e. a state reset) is expensive, unconditionally resetting all states regardless of the failure point could be more costly;
2. In the circumstance that (1) holds, the existing syntactic alternative is, as described:

However, I think the syntactic burden of try extends beyond the particular use case described here. Consider comparable code like this, which is deliberately un-nested:

func performSteps() -> Output {
  let resultA: ResultA
  do {
    resultA = try stepA()
  } catch {
    return fallbackA
  }

  let resultB: ResultB
  do {
    resultB = try stepB(resultA: resultA)
  } catch {
    return fallbackB
  }

  // ...
}

The existing mechanism to un-nest sequential throwing operations with custom per-failure-point catch handling introduces 1) an otherwise unnecessary type annotation and 2) a fair amount of syntax (do/try/catch plus two sets of braces) for what roughly amounts to a throwing guard statement.

This is recognized as unergonomic ([1], [2], [3], [4], [5], ...), and I suspect a syntactic solution to this challenge would largely address the problem described here. Strawman:

func transitionStates() throws {
  stateA = try computeSix() catch { 
    stateA = 0 
    throw error
  }

  stateB = try computeSeven() catch {
    (stateA, stateB) = (0, 0)
    throw error
  }
  
  stateC = try computeNine() catch {
    (stateA, stateB, stateC) = (0, 0, 0)
    throw error
  }

  try finishTransition()
}

Although this requires some duplication of cleanup logic:

the order of operations is both unambiguous and linear (in contrast to sequenced defer blocks, which must be traced up backwards to interpret);
the syntactic burden is alleviated comparably to the proposed solution;
the example captured by performSteps() above benefits equally from the new syntax.