[Pitch] Opt-in Strict Memory Safety Checking

I might be misunderstanding what you meant by

because, at least in my imagination, it seems like that could only ever mean functions that transit unsafe pointer values without using them as pointers themselves. If the function is using memory in a verifiably-safe way, then it seems like, at least in the fullness of time, that function should communicate that by using safe language features like nonescapable types and borrow/inout rather than using UnsafePointer at all.
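Concretely, here's a rough sketch of that trajectory (the function is hypothetical; Span is the safe view type from the related lifetime-safety work):

// Today: transits and dereferences a pointer. Under strict checking, both
// the signature and the accesses get flagged.
@unsafe func sum(_ base: UnsafePointer<Int>, count: Int) -> Int {
  var total = 0
  for i in 0..<count { total += unsafe base[i] }  // unchecked access
  return total
}

// In the fullness of time: the same function over a safe, nonescapable view
// of the memory. No unsafe types appear in the signature or the body.
func sum(_ values: Span<Int>) -> Int {
  var total = 0
  for i in 0..<values.count { total += values[i] }  // bounds-checked
  return total
}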

2 Likes

I'm advocating that we shouldn't do that. It's one of two things we could change if we wanted to allow something like proc_name(pid, unsafe &buffer, buffer.count) with no unsafe in front of proc_name, and I think it's not worth it. I'm also not coming up with any reason that you would continue to accept unsafe pointers when you're using them safely (you would use Span or inout or borrows).

1 Like

I think this should be documented in the proposal as the rationale for choosing the unsafe expression over @safe(unchecked). I disliked the change when I first came across Doug’s new commits, but your words convinced me immediately.

1 Like

I think it's still sound: we will get a safety-related diagnostic for the code that introduces the unsafety, just not at every call site that can then reach that unsafe code. It's similar to passing a function containing unsafe code along to some safe code that will call it, e.g.,

func doSomething(_ f: ([Int]) -> Void) {
  // "safe" call that could end up calling unsafe code.
  f([1, 2, 3, 4, 5])
}

doSomething { array in
  // flag unsafe code here where it is introduced
  array.withUnsafeBufferPointer { buffer in
    print(buffer[0])
  }
}

We don't prevent you from calling doSomething with a closure containing unsafe code, nor do we call doSomething unsafe. Rather, the unsafe code is diagnosed in the closure itself, and then the unsafety is "erased" by passing the closure along as a value of function type.

The subclass override is analogous: the unsafety is "erased" by the override, so we need to make sure that any code that would make it possible to introduce the unsafety is diagnosed. Making the subclass @unsafe is a conservative way to do so. A narrower way to do this would be to diagnose the conversion to the superclass, but implementing that is harder for annoying reasons, and I expect this issue not to come up that often.
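To make the analogy concrete, a rough sketch with made-up names:

class Reader {
  func byte(at index: Int) -> UInt8 { 0 }  // safe API
}

// Marking the subclass @unsafe is the conservative fix: constructing one is
// diagnosed, even though upcasting to 'Reader' then "erases" the unsafety.
@unsafe class UncheckedReader: Reader {
  var pointer: UnsafePointer<UInt8>? = nil
  override func byte(at index: Int) -> UInt8 {
    // SAFETY: the caller must guarantee that 'index' is in bounds.
    unsafe pointer![index]
  }
}

func makeReader() -> Reader {
  unsafe UncheckedReader()  // the unsafety is acknowledged here, once;
                            // downstream code sees only 'Reader'
}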

Yes, you're right. Proposal updated!

Doug

I'm slightly worried about style alignment across the language with the current design: unsafe and @unsafe look similar to, yet aren't symmetric with, Sendable and await. That is:

  • Sendable is a marker protocol, but @unsafe is an attribute, which also results in @Sendable and @unsafe using different cases;
  • nonisolated is a keyword to opt out of sendability, but @unsafe is an attribute;
  • async and await are a pair of keywords indicating the use of concurrency, but unsafe corresponds to an @unsafe attribute (perhaps acceptable, since await is also used for actor isolation, which doesn’t require async to be present).

Sendable, nonisolated, and async all have representation in the type system, whereas @unsafe does not. I acknowledge that programmers might expect consistency here and be surprised when they don't find it, but I think the inconsistency is important: most declaration attributes don't change the type of the declaration, and that's the behavior we want for @unsafe.
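To illustrate the difference (a rough sketch):

// 'async' and '@Sendable' are part of a function's type...
let f: @Sendable () async -> Void = { }

// ...whereas '@unsafe' is not: an @unsafe function still has the plain type
// '() -> Void' when used as a value, and only the *use* is diagnosed.
@unsafe func dangerous() { }
let g: () -> Void = unsafe dangerous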

Doug

1 Like

Based on your comments here and additional motivation provided by @geoffreygaren and others, I've come around to agree with your position. Subsequently, I've updated the proposal and implementation to introduce an unsafe expression:

unsafe <expression>

that suppresses diagnostics about any uses of unsafe constructs within its subexpression, just like try and await acknowledge the use of throwing or asynchronous constructs in their subexpressions. With this change, my sum example looks like this:

extension Array<Int> {
  func sum() -> Int {
    unsafe withUnsafeBufferPointerSimplified { buffer in
      unsafe c_library_sum_function(buffer.baseAddress, buffer.count, 0)
    }
  }
}

The proposal PR has been updated to reflect this change, and I've put up an implementation as well. Toolchains are building now, and I'll update this post with those links once they're ready.

I did a bit of work to make sure the experience of adopting this mode is decent. We'll always produce a single warning for each expression that uses unsafe code, and call out all of the places where there is unsafety in that expression. On the command line, that looks a bit like this:

13 |     // warning: use of unsafe function 'withUnsafeBufferPointerSimplified'
14 |     unsafe withUnsafeBufferPointerSimplified { buffer in
15 |       c_library_sum_function(buffer.baseAddress, buffer.count, 0)
   |       |                      |      |            |      `- note: reference to unsafe property 'count'
   |       |                      |      |            `- note: reference to parameter 'buffer' involves unsafe type 'UnsafeBufferPointer<Int>'
   |       |                      |      `- note: reference to unsafe property 'baseAddress'
   |       |                      `- note: reference to parameter 'buffer' involves unsafe type 'UnsafeBufferPointer<Int>'
   |       |- warning: expression uses unsafe constructs but is not marked with 'unsafe'
   |       `- note: reference to global function 'c_library_sum_function' involves unsafe type 'UnsafePointer<Int>'
16 |     }

It's a similar story in Xcode (where I've clicked on the yellow triangle to get details).

In all of these cases, you get a Fix-It to add unsafe in the right place. (It's the same logic as for try and await).

It's a similar story with uses of unsafe types in the signature of a declaration. We'll get one warning per declaration, with a Fix-It to add @unsafe to the declaration. The warning looks like this:

 2 | 
 3 | extension Array {
 4 |   func withUnsafeBufferPointerSimplified<T>(_ body: (UnsafeBufferPointer<Element>) -> T) -> T {
   |        |                                             `- note: reference to unsafe generic struct 'UnsafeBufferPointer'
   |        `- warning: instance method 'withUnsafeBufferPointerSimplified' has an interface that is not memory-safe; use '@unsafe' to indicate that its use is unsafe
 5 |     // ...
 6 |     fatalError("boom")

The net effect is that it should be possible to turn on strict checking, accept all of the Fix-Its, and be done: all unsafety within your module is now explicit.
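For anyone who wants to try it: here's roughly how to opt a SwiftPM target into the mode with these toolchains (the exact feature name or flag spelling may still change as the proposal evolves):

// In Package.swift; equivalent to passing
// '-enable-experimental-feature StrictMemorySafety' to the compiler.
.target(
  name: "MyModule",
  swiftSettings: [
    .enableExperimentalFeature("StrictMemorySafety")
  ]
)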

EDIT: Toolchains are available now:

4 Likes

This is a fair point, but I ultimately agree with the gist of @fclout's argument:

I would go even further and say that inout and borrow usage of an unsafe pointer also has to be considered unsafe.

We can distinguish between two possible meanings for the keyword unsafe:

  1. A dictionary-like definition: Memory corruption might happen here.

According to the dictionary-like definition, @Joe_Groff is right: Memory corruption can't happen here. It might happen elsewhere, but not here.

This option is the most parsimonious: Nothing about an unsafe pointer needs to be unsafe -- only dereferencing it.

FWIW, this appears to be the definition Rust uses, at least for raw pointers: "We can create raw pointers in safe code; we just can’t dereference raw pointers outside an unsafe block..."

I don't think we should copy Rust in this case, but at least that's an indication that some folks find that kind of programming workable.

  2. A process-oriented definition: The programmer needs to verify an outside-the-language invariant here. Once the programmer verifies this invariant locally within the unsafe expression (assuming they've done so correctly), the rest of the program is memory safe globally (until the next unsafe expression -- rinse, repeat).

It's the process-oriented definition that requires unsafe here. We seek a process that invites the programmer and/or reviewer to read the local text of the program, in combination with surrounding documentation, and thereby verify an outside-the-language invariant.

The specific invariant we need to verify when we copy a pointer is shared ownership / lifetime: Do the copied-from pointer and the copied-to pointer agree on how they are going to share ownership of the pointee?

(And in the case of borrowing and inout, the invariant we need to verify is that the caller will keep the pointee alive for the duration of the call.)

This process-oriented definition has been working really well for us in the application of bounds and type safety to C++. We treat pointer arithmetic and pointer casting as an error at the point of arithmetic / cast -- not because that's the location where the memory corruption happens, but rather because that's the location where you have a chance to verify the invariant (with a runtime bounds check or type check).

In the alternative, the programmer has no hope: Here's a pointer. It has previously undergone an unknowable set of transfers between owners, increments and decrements, and casts. Now, is it still alive, correctly typed, and in bounds? Oy!

You could argue that programmers should know not to pass pointers around willy-nilly. But what will make them know? The unsafe keyword is how we, as language designers, share the knowledge.

This is a slightly different concern. It's 100% true that a pointer used in this sense is perfectly safe. The reason we need to consider it unsafe is a limitation of our type system. Our type system -- both UnsafePointer, and mmap's void* -- does not express "a pointer that I will never dereference", so we are forced to treat this as a pointer that I will dereference. You could imagine expanding the type system to represent "a pointer that I will never dereference", but I don't think the juice is worth the squeeze.
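If we did expand the type system, one could imagine something like this (purely hypothetical, not part of the proposal):

// A "pointer that I will never dereference": only the address bits are
// stored, so no dereference is even expressible through this type, and
// nothing about it would need to be unsafe.
struct OpaqueAddress: Hashable {
  private let bits: UInt
  init(_ pointer: UnsafeRawPointer) {
    bits = UInt(bitPattern: pointer)
  }
}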

1 Like

I guess I agree that unsafe conformance is sound in the most basic sense that memory corruption can't happen unless you've said unsafe somewhere in your program.

But I'm still concerned.

Consider a case like this:

struct FastCollection : /* @unsafe */ Collection { // Unsafe conformance
    // MEMORY SAFETY: Caller must ensure that 'position' is in bounds.
    /* @unsafe */ subscript(position: Int) -> Int {
        get {
            return uncheckedGet(position)
        }
    }
    /* ... */
}

and then a line of code like:

let collection: any Collection = /* @unsafe */ FastCollection()

We're right back where we started, with people writing code like

let pointer = array.withUnsafeMutableBufferPointer { return $0 }

The @unsafe here is performative only. This is not the part of the program that subscripts into the collection, so the programmer can't verify here that subscripting is in bounds.

It's exceptionally subtle what @unsafe even means here. At first glance, it would seem to mean "verify that this cast is valid". And it is.

FWIW, Rust does not allow an unsafe conformance to a safe function. So that's some evidence that developers can live with the restriction.

To clarify, I think it's fine for a closure to contain unsafe code. It's when a closure or function advertises that it is @unsafe that conversion to @safe should not be allowed.
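In code, the distinction I'm drawing is roughly this (names are made up):

// Fine: a closure of safe type whose body contains acknowledged unsafe code.
let first: ([Int]) -> Int = { array in
  unsafe array.withUnsafeBufferPointer { buffer in
    // SAFETY: assumes a non-empty array; the caller must guarantee this.
    unsafe buffer[0]
  }
}

// Not fine: an @unsafe function value converted to a safe function type,
// silently dropping the @unsafe marker along the way.
@unsafe func firstUnchecked(_ array: [Int]) -> Int { /* ... */ 0 }
let erased: ([Int]) -> Int = unsafe firstUnchecked  // this conversion is the
                                                    // one I'd reject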

2 Likes

I'm starting to agree that the spelling

unsafe proc_name(pid, &buffer, buffer.count)

might be preferable to

proc_name(pid, unsafe &buffer, buffer.count)

or at least not worse.

The main thing that convinces me is that, in this example (and many others, I'm sure), it's really really ambiguous whether it's buffer that's unsafe, or the combination of buffer and buffer.count.

buffer.count is just an Int, so as a type it's safe as can be. But buffer.count has an outside-the-language invariant: it must describe the length of buffer in the way the callee expects (including distinguishing "number of bytes" from "number of elements", and handling of null termination).

From this perspective, is it buffer, buffer.count, or proc_name that's unsafe? I don't know! And we can't reasonably expect the authors of (hundreds of thousands of) functions like proc_name to successfully navigate this subtle distinction.

I don't think the unsafe here is performative. It's indicating that you have introduced unsafety into your system at this point. I don't see this as materially different from referring to an unsafe type and having that convert to an Any, or be passed into a generic function. Those are also marked as unsafe at the point where you get the unsafe value/type, and the actual bad behavior from the unsafety might occur a whole lot later in generic code. I don't think we can prohibit any of these patterns without making the strict mode unusable for many people.

The compiler currently says this:

@unsafe conformance of 'FastCollection' to protocol 'Collection' involves unsafe code

Rust doesn't have collection traits, either, which is the crux of your example and a huge part of working with collections in Swift. UnsafeBufferPointer is a collection today, used fairly often, so I do not think it's reasonable for us to say that you simply cannot use that conformance in strictly-safe code.

Now, we could say that only unsafe types can conform to a protocol using an unsafe witness to a safe requirement. However, I don't think that addresses your concern, and it's less expressive than what's in the proposal.

Doug

1 Like

I need to think about this more, but I wanted to throw in some early thoughts:

First, using unsafe at the point that the unsafe type is converted to a safe type makes sense to me. Unsafe is a reminder to the engineer that they are taking responsibility for the safety of the next operations and we need to ensure that the language requires an unsafe at least at the last operation where the engineer has a chance to do their checks.

Second, it is quite awkward that an unsafe implementation doesn't just mean the implementation uses unsafe code; it might mean that some checks you would expect are missing. For instance, this function cannot lead to memory corruption when used with Array, but it can when used with UnsafeMutableBufferPointer:

func setItem8<C>(_ collection: inout C, value: C.Element)
	where C: MutableCollection, C.Index == Int
{
	collection[8] = value
}

It's a problem that you need to know the implementation of the generic function you are instantiating to know whether it will be safe when using an unsafe implementation of a protocol.
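To spell out the contrast with a rough usage sketch:

var array = Array(repeating: 0, count: 4)
setItem8(&array, value: 42)  // memory-safe: Array traps on the bad index

// With an unsafe implementation of the same requirements, the identical call
// silently writes out of bounds (left commented out for that reason):
// setItem8(&someUnsafeMutableBufferPointer, value: 42)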

Upon reflection, I think the main difference between positive and negative examples of unsafe conformance / casting away unsafety is escapability.

This seems fine:

// SAFETY: The NoBoundChecks type does not check bounds, but that’s OK because firstIndex(of:) is a bounded loop
let index = unsafe noBoundChecks.firstIndex(of: x)

This meets our goal that the caller should verify invariants at the point of the unsafe keyword. The caller can locally verify that firstIndex(of:) is documented to iterate from .startIndex to .endIndex.

This case is real. Though it is very rare, we have occasionally needed to disable mechanical bounds checking inside a bounded loop in order to meet our performance goals in WebKit.

The C++ Safe Buffers programming model works this way too. Though subscripting is generally treated as unsafe, ranged for loops are considered safe — for any type.

In this spirit, I think @fclout's example of setItem8 might be fine? The caller is alerted that calling setItem8 with an unsafe type is unsafe, and the caller is in a reasonable position to know that setItem8 requires index 8 to be in bounds.

These seem sketchy:

// SAFETY: Who uses this property? I have no idea. Callers are promised a bounds checked value, but they don’t get one.
var collection: any Collection { get { return unsafe noBoundChecks } }
// SAFETY: Who manages the elements in ‘collection’? I have no idea. Callers are promised an element with automatic lifetime, but they don’t get one.
func addUseAfterFreeToYourCollection<C: MutableCollection>(collection: inout C, index: C.Index, unsafeLifetime: UnsafeLifetime) where C.Element == UnsafeLifetime { collection[index] = unsafe unsafeLifetime }

Because these wolf-in-sheep's-clothing unsafe types are cast to safe types and then sent elsewhere in time and space, there’s no way for us to verify any invariants locally.

You could imagine establishing an informal style rule that an unsafe cast of an unsafe conformance should not be allowed to escape, but the whole reason we’re implementing a strictly safe mode is that we know that humans are unable to enforce rules like that reliably.

Options

A. Is it possible to say that casting an unsafe type to a safe protocol conformance also suppresses Escapable conformance? That would be a formal way to enforce this observation. And functions like firstIndex(of:) don’t require escapable conformance.

B. If we’re forced to make unsafe conformance / casting away unsafety escapable, then I think the negative examples outweigh the positive. Some tie breakers in my mind include:

  1. Programmers who want less strict verification of safety can just program in standard Swift. Meanwhile, programmers who want more strict verification of safety have no alternative. On balance, it is better to make strict verification of safety possible, even if it is sometimes annoying.

  2. Anything that can be done with unsafe conformance can also be done with a safe conformance that wraps the unsafe one. You can make a wrapper type that verifies invariants before calling through to your unsafe type (see the sketch after this list). And that leaves behind a textual record of the place where the operation must verify invariants. So no programming is impossible, but some programming is more verbose.

  3. If you really want, you can just mark your conformance safe, and mark your init() unsafe. I think that’s a bad thing to do! Hopefully you’ll get an r-. But it’s not impossible, and if some people want to program this way, they can, even in strictly safe mode. It’s tempting to think that unsafe at the point of casting is more precise, and therefore more safe, than unsafe at the point of allocation, but I’m not sure it is. Casting and allocation have an important feature in common: the result value is escapable, and may travel anywhere in time and space. Their scopes are equally infinitely imprecise.

  4. We can always add unsafe conformance at some later date. If we gather experience programming this way, and people are Having a Bad Time (TM), we can relax the restriction. Adding a restriction later is much harder (as we’ve seen with thread safety restrictions).
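As a rough sketch of the wrapper pattern from point 2, reusing the hypothetical FastCollection from upthread (and assuming it exposes a count):

// A safe facade: invariants are verified here, in one auditable place,
// before calling through to the unchecked implementation.
struct CheckedFastCollection {
  private var storage = FastCollection()
  var count: Int { storage.count }

  subscript(position: Int) -> Int {
    precondition(position >= 0 && position < count, "index out of bounds")
    // SAFETY: 'position' was bounds-checked immediately above.
    return unsafe storage[position]
  }
}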

setItem8 is trivial, so I do imagine that people would know what it does. However, you could also imagine a function that parses some complex variable-length data structure out of a Collection<UInt8>, and is safe when instantiated with Data or [UInt8], but is dangerous when used with UnsafeBufferPointer<UInt8>. In this hypothetical, the implementer made a mistake that went undetected. With safe Collection implementations, this mistake does not lead to memory corruption: it leads to a fatalError in the unusual (but crucial) case that malicious, contrived data is provided. However, with UnsafeBufferPointer, it leads to out-of-bounds accesses.

This hypothetical function has the same escaping properties as setItem8.

Having given it more thought, IMO, the problem is that users of Collection assume implementations are (minimally) resilient against bad indices, and Unsafe{,Mutable}BufferPointer is not. I expect that we would run into the same kinds of problems for any implementation of a protocol where users make assumptions about memory safety that implementers take liberties against.

The emerging rule from this realization would be that a @unsafe symbol cannot satisfy a safe protocol requirement. You can't count on generic users knowing the specialized rules to uphold for your unsafe implementation. It's OK for an implementation to use unsafe code inside a safe protocol requirement as long as the unsafety doesn't propagate out.

A solution naturally falls out, as well: the subscript implementing the Collection requirement on Unsafe{,Mutable}BufferPointer should be bounds-checked. I think it's a well-known fact at this point that I would like the subscript to always be bounds-checked, but if we had a way to tell Swift that the Collection.subscript(_: Index) requirement is actually implemented by UnsafeBufferPointer.subscript(_checked: Index), that would be enough, too.
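A rough sketch of what that redirection could look like (the _checked label is hypothetical, and so is the mechanism for nominating it as the witness):

extension UnsafeBufferPointer {
  // A bounds-checked entry point that could, hypothetically, satisfy the
  // Collection.subscript(_: Index) requirement in place of the unchecked one.
  subscript(_checked position: Int) -> Element {
    precondition(position >= 0 && position < count, "index out of bounds")
    // SAFETY: bounds were checked above; forward to the unchecked subscript.
    return unsafe self[position]
  }
}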

I think that this is an unusual situation and that, if there even are other unsafe conformances to safe protocols, they are rare enough that we could tell people to wrap their unsafe conformance into a safe wrapper. (We could do this for UnsafeBufferPointer too, really: the wrapper is Span.)

I get the purpose behind adding this syntax, but the specific unsafe keyword seems misleading. As an author/reader of this code, I read it as a statement about the safety of <expression>: "this code I'm executing is unsafe".

However, the meaning should be the opposite: "this code uses unsafe constructs, but I have made sure that I'm doing it safely". If that isn't the case, and there's an assumption about safety that should be thought about by the caller, then the outer function should be marked with @unsafe instead of the expression.

A different keyword would also better align with the throws/try and async/await pairs. The first of the pair describes the callee's behavior, and the second describes the caller's intent.

The alternative that jumps out at me is @unsafe/safely, but I don't like it very much. I'll try to think of others!

4 Likes

I spoke with @geoffreygaren offline and this doesn't work:

The emerging rule from this realization would be that a @unsafe symbol cannot satisfy a safe protocol requirement.

because it means a @unsafe type cannot satisfy protocol requirements at all, and that is a problem. Giving the Collection implementation a bounds check still seems to do the right thing: if we don't do it, it's likely that we'd nag many users of UBP as a Collection to wrap it in something that bounds-checks.

Adding bounds checking, while a definite improvement, won't make U(M)BP's Collection conformance safe, as its elements may be uninitialized.

Right.

That might be a reasonable resting point: not introducing bounds checks everywhere for UnsafeBufferPointers, but performing the checks as part of the conformance to Collection will improve things.

You are correct. It also won't provide lifetime safety for U(M)BPs.

However, I think there's an important qualitative difference here: you can inspect the code that's handling the U(M)BP to convince yourself that everything in it is initialized before passing it off as a Collection, and once you've done that there aren't any Collection operations that could cause initialization-related problems in the operation on Collection. Lifetime safety is similar: you can reason at the call site that the U(M)BP references memory that is exclusively yours, and will remain alive for the call, and it's unlikely that the operation on a Collection will violate that expectation.

With bounds safety, you can verify that the bounds are correct in the call to the operation on Collection, but you don't have a reasonable assumption that this is good enough: there are lots of collection operations that produce out-of-bounds indices (hello, endIndex), so normal bugs in the operation on a Collection could violate bounds safety when given a U(M)BP.
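For example, here's the kind of routine off-by-one I mean (a rough sketch):

// endIndex is a valid index *value* but never a valid *subscript*.
func lastElement<C: Collection>(_ c: C) -> C.Element {
  c[c.endIndex]  // with [Int] this traps deterministically; with
                 // UnsafeBufferPointer<Int> it reads out of bounds instead
}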

Doug

2 Likes

Right. You also can't treat interfaces like prefix(while:) or for-in as safe because U(M)BP doesn't ensure lifetime / exclusive access during the closure callback / iteration.

I'd like to suggest that this change come with a DocC improvement that automatically labels unsafe members with a warning box encouraging users to enable StrictMemorySafety for their module if they plan on using that type/function (much like closure-based callbacks got an async recommendation). This way, StrictMemorySafety can stay off by default, but almost anyone who looks up unsafe code will see the suggestion to enable these warnings/errors, leading to more adoption across the board.

1 Like