SE-0458: Opt-in Strict Memory Safety Checking

fclout · February 10, 2025, 11:10pm

There is a non-hypothetical, stated intent that imported C structs are @safe without actually being safe. I think it's reasonable to discuss what design space that decision blocks off if we also agree as a requirement that at least one of the 4 lines of my snippet needs to be unsafe.

John_McCall · February 10, 2025, 11:49pm

Sure, I'm not trying to prevent any discussion, just trying to explore what you're saying. Parts of what you're saying don't make any sense to me, and I want to try to understand your point. Whether or not foo is a safe type, UMP<foo> is not. The reference-ness is core to the semantics of foo_delete and cannot be correctly annotated away, so the function either takes a managed pointer (in which case it's fine to be @safe) or an unmanaged pointer (in which case it must be @unsafe). The type properties of foo don't really change the design space at all.

fclout · February 11, 2025, 1:00am

Thanks, this helps. I'm going to change foo_delete to foo_resize because it shows the same problem and it removes the possibility of consuming semantics from the question. Now we have:

struct foo {
    int *bar;
};

void foo_init(struct foo *out) {
    out->bar = malloc(sizeof(int) * 200);
}

void foo_resize(struct foo *f) {
    f->bar = realloc(f->bar, sizeof(int) * 400);
}

This is what it translates to under the current proposal (off the top of my head):

struct foo {
	var bar: UnsafeMutablePointer<CInt>
	foo()
}
@unsafe func foo_init(_ out: UnsafeMutablePointer<foo>)
@unsafe func foo_resize(_ f: UnsafeMutablePointer<foo>)

In the 4-line snippet, we would have this:

var f = foo()
unsafe foo_init(&f)
var g = f
unsafe foo_resize(&g)
unsafe foo_resize(&f) // double-free!

This is a developer mistake, and their code does contain unsafe statements, so you can blame it on them. However, the actually-unsafe operation was var g = f. foo_resize makes the assumption that the foo has unique ownership of the bar pointer: the line where we broke that assumption was var g = f. There is no unsafe there because, under the current proposal, C structs that contain pointers are @safe implicitly. It's tempting to say it's fine because we do have unsafe somewhere else, but the problem is exacerbated if we imagine how Swift could start importing annotated functions as safe interfaces.
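
The aliasing that var g = f creates can be seen without involving C at all. The following is a minimal sketch, assuming a pure-Swift stand-in for the imported struct: Foo and fooInit are hypothetical names used only for illustration and are not what the importer produces. It shows only that the copy duplicates the pointer and not the allocation, and it stops short of the double free.

```swift
// Hypothetical pure-Swift stand-in for the imported C struct `foo`.
struct Foo {
    var bar: UnsafeMutablePointer<CInt>? = nil
}

// Stand-in for foo_init: allocates storage that `out` is assumed to own uniquely.
func fooInit(_ out: inout Foo) {
    let storage = UnsafeMutablePointer<CInt>.allocate(capacity: 200)
    storage.initialize(repeating: 0, count: 200)
    out.bar = storage
}

var f = Foo()
fooInit(&f)
let g = f  // an ordinary struct copy: the pointer is duplicated, the allocation is not

// Both values now claim ownership of the same allocation.
let aliased = (f.bar == g.bar)
print(aliased) // prints "true"

// Release the storage exactly once; releasing it through both `f` and `g`
// would be the double free described above.
f.bar?.deallocate()
```

Nothing in the copy itself involves an unsafe operation, which is why no unsafe marker would be requested on that line.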

If we enumerate the assumptions that Swift makes about inout references, we can dress up a C pointer such that it satisfies all of them, and then Swift can expose a (safe) function that takes an inout reference instead of UMP:

// void foo_init(struct foo * /*annotations here*/ out);
func foo_init(_ out: inout foo)
// void foo_resize(struct foo * /*annotations here*/ f);
func foo_resize(_ f: inout foo)

This is a future direction, but there's desire to do that.

The concern is that the preconditions we have on foo_init and foo_resize are that struct foo has unique ownership of its resources, which is orthogonal to what is relevant to function arguments. As a result, we can have attributes that are correct on function arguments to get a safe projection and land ourselves in memory corruption regardless:

var f = foo()
foo_init(&f)
var g = f // clearly where the problem is
foo_resize(&g)
foo_resize(&f) // double-free!

This is annoying:

a C annotation turning a copyable struct into a move-only struct in Swift would be source-breaking, so even if it existed, it may not be possible to use;
the functions could live in a module that is separate from the one that defines the type, so it could be impossible to add the annotation in the first place;
even if we were able to use that annotation, the bug is that the attribute wasn't used, and it's always harder to identify omission bugs because they live in the negative space of code reviews.
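
For comparison, here is a rough sketch of what a move-only version of the type could look like if it were written directly in Swift. It assumes a Swift 5.9 or later toolchain with noncopyable types; UniqueFoo, resize(to:), and demo are hypothetical names, and this is not something the C importer generates today.

```swift
// Hypothetical move-only wrapper that owns its allocation uniquely.
struct UniqueFoo: ~Copyable {
    private var bar: UnsafeMutablePointer<CInt>
    private(set) var capacity: Int

    init(capacity: Int = 200) {
        let storage = UnsafeMutablePointer<CInt>.allocate(capacity: capacity)
        storage.initialize(repeating: 0, count: capacity)
        self.bar = storage
        self.capacity = capacity
    }

    // Grow-only resize, standing in for foo_resize.
    mutating func resize(to newCapacity: Int) {
        precondition(newCapacity >= capacity, "this sketch only grows")
        let fresh = UnsafeMutablePointer<CInt>.allocate(capacity: newCapacity)
        // Move the existing elements, then zero-fill the new tail.
        fresh.moveInitialize(from: bar, count: capacity)
        (fresh + capacity).initialize(repeating: 0, count: newCapacity - capacity)
        bar.deallocate()
        bar = fresh
        capacity = newCapacity
    }

    // Runs exactly once, because the value cannot be copied.
    deinit {
        bar.deallocate()
    }
}

func demo() -> Int {
    var f = UniqueFoo()
    f.resize(to: 400)
    // let g = f          // would consume `f` instead of copying it...
    // f.resize(to: 800)  // ...so this use of `f` would be rejected at compile time
    return f.capacity
}

print(demo()) // prints "400"
```

With a type like this the problematic copy cannot be written, but as the list above notes, retrofitting that onto an existing copyable C struct would be source-breaking.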

IMO, this all happens because we are choosing @safe as a default for structures while everything else safe would be opt-in through annotations.

Douglas_Gregor · February 11, 2025, 1:15am

I don't think we should do this as a general rule. Swift is a language built on interoperability with C, and we should avoid categorically claiming that all of C is unsafe. If an external C function presents a safe interface (i.e., there are no unsafe types like UnsafePointer, or they've all been appropriately annotated to bring them in as something like a Span), we should consider those APIs safe.

This is something I'd considered for the Swift part as well: if the storage of a type involves an unsafe type or conformance, we could consider the type to be @unsafe. I've been a little nervous about the effects of this, because similar rules with Sendable have caused issues with recursive enums and classes.

The way the current proposal is written, C structs follow Swift rules in the sense that they are considered safe unless annotated otherwise. The main difference is that there is no "nudge" to get folks to annotate their C structs as unsafe or safe, because there's no equivalent to strict safety mode for the C headers themselves.

Doug

Douglas_Gregor · February 11, 2025, 1:34am

Hi all,

During my continued rollout of this feature throughout the standard library, I've come across two things that might also be worth discussing: for..in loops, and whether we need to mark interfaces as @unsafe when it's implied by their type signatures.

For..in loops

For for..in loops, there is a question of where to put the unsafe keyword when the Sequence or IteratorProtocol conformances used for iteration are unsafe. If we were to follow the precedent of SE-0298, it would go after the for and before the iteration variable, e.g.,

for unsafe element in sequence { ... }

The unsafe is a little bit different from try or await, because it doesn't propagate out of the for loop... it's just acknowledging that we've reasoned about the unsafety introduced there.

The alternative is to put the unsafe annotation on the sequence itself, which covers any unsafety in producing the sequence and in iterating over it, like this:

for element in unsafe sequence { ... }

Marking clearly-unsafe things as `@unsafe`

The proposal effectively infers @unsafe for any declaration whose signature includes unsafe types or conformances, e.g., this will be treated as unsafe when used because UnsafeBufferPointer is @unsafe:

func withUnsafeBufferPointer(_ body: (UnsafeBufferPointer<Pointee>) -> Void) { ... }

As proposed, the strict safety mode will produce a warning due to the use of the unsafe type, and require that you either tag this as @unsafe or @safe to silence the warning. Having done this over a lot of code, I'd like to back off that a little bit: since we are already inferring that it is unsafe due to the unsafe types, there is no reason to annotate it as such, because that's just busywork. If instead you want to say that it's safe, like the count operation on UnsafeBufferPointer, then you can mark it as @safe: but that's the outlier, not the norm. Making this change means less annotation overall when enabling strict memory safety for a module, and also gives it a nicer flow:

1. A tool can automatically add unsafe to expressions and @unsafe to conformances that need it based on the presence of unsafe types in signatures.
2. One can mark specific operations as @safe, and tooling can help remove any now-unnecessary unsafe expressions or @unsafe conformances.

A nice thing about this is that (1) conservatively and mechanically gets into the strict safety model, and (2) is effectively removing false positives one-by-one so it's easy to reason about. (Thanks to @John_McCall for noticing the workflow advantages of this approach)
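
A small sketch of how the revised rule might look in practice, assuming a Swift 6.2 or later toolchain in which the unsafe expression and the @safe attribute are accepted. The functions sum and elementCount are hypothetical helpers, not standard library API.

```swift
// Assumes Swift 6.2 or later, where `unsafe` and `@safe` are available.

// Treated as unsafe to use: the signature mentions UnsafeBufferPointer,
// so under the revised rule no explicit @unsafe is written.
func sum(_ buffer: UnsafeBufferPointer<Int>) -> Int {
    var total = 0
    for index in 0..<buffer.count {
        // Indexing into the buffer is acknowledged as unsafe.
        let value = unsafe buffer[index]
        total += value
    }
    return total
}

// The outlier: explicitly marked @safe because it only reads the count.
@safe func elementCount(_ buffer: UnsafeBufferPointer<Int>) -> Int {
    return buffer.count
}

let values = [1, 2, 3]

// The call to `sum` is acknowledged with `unsafe` at the use site.
let result = values.withUnsafeBufferPointer { buffer in unsafe sum(buffer) }

// The call to the @safe helper needs no acknowledgement.
let count = values.withUnsafeBufferPointer { buffer in elementCount(buffer) }

print(result, count) // prints "6 3"
```

Only the @safe outlier carries an annotation; the unsafe-by-signature function is left alone, which is the reduction in busywork described above.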

Doug

fclout · February 11, 2025, 7:18am

It appears to me that assuming C structs with pointers are @safe is inconsistent with assuming that C functions that accept pointers are @unsafe, and per my longer message, that could become a bigger problem when Swift starts understanding bounds/lifetime attributes on pointer function parameters.

Douglas_Gregor:

The alternative is to put the unsafe annotation on the sequence itself, which covers any unsafety in producing the sequence and in iterating over it, like this:
for element in unsafe sequence { ... }

I'm not passionate about this, but my mental model is that the annotations for element should match what you have on IteratorProtocol.next() and the annotations for sequence should match what you have on Sequence.makeIterator(). If you get a safe iterator from an unsafe sequence, IMO, unsafe goes on the side of sequence.
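
This mental model follows from how a for..in loop expands into one makeIterator() call followed by repeated next() calls. A minimal sketch of that expansion, using an ordinary array so that nothing here is actually unsafe; the comments indicate which step each placement of unsafe would correspond to under this reading:

```swift
let sequence = [1, 2, 3]
var total = 0

// Step 1: Sequence.makeIterator().
// Under this reading, `for element in unsafe sequence` corresponds to this step.
var iterator = sequence.makeIterator()

// Step 2: IteratorProtocol.next(), called once per element.
// Under this reading, `for unsafe element in sequence` corresponds to this step.
while let element = iterator.next() {
    total += element
}

print(total) // prints "6"
```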

Douglas_Gregor · February 11, 2025, 4:52pm

It's the same proposed rules as for Swift without strict memory safety enabled. Now, C doesn't have a "strict memory safety" model, and skews much more toward unsafe than any given Swift code, so we could choose to make the rules for imported C types different. If so, I do wonder if that should become its own separate proposal that provides these tighter rules along with the mechanisms for getting safe interfaces from C (lifetime and bounds annotations), because there's a lot of nuance there.

Doug

fclout · February 11, 2025, 6:06pm

Yes, I would say that the required prevalence of unsafe pointers in C is an important-enough distinction from Swift that we should use different rules. The Swift rules are relatively inconsequential because holding pointers in escapable structures is unusual. This is the norm for C, so getting it wrong is that much more likely to cause sadness.

I also agree that what's emerging here is that the rules for whether a C struct should be @safe are entangled with the rules for safe function projections, so we should discuss them together later. With that said, at the risk of being stubborn, I would still advise that we start more restrictively and ease off later, which would mean to assume that C structs containing pointers should be unsafe by default. As we bring up strict memory safety, we have an opportunity to introduce "source breaks" (warnings that you should use unsafe) into the dialect. I expect this window to shrink rapidly. If there's unsafe paperwork to file in the standard library as a result, I'm open to doing that adoption work.

AliMark71 · February 11, 2025, 9:41pm

+1 on the proposal.

I've generally wanted to see such a feature in Swift for some time, and I'm happy to see it here, helping make Swift code even safer.

just wanted to raise one question: what is the expected behavior of an @unsafe main function?

consider the following sample code:

@main
enum Entery {
    @unsafe static func main() { /* Code... */ }
}

currently (as I've tested with 2025-02-06 snapshot toolchain), this will get a warning at the @main attribute about the use of unsafe constructs without marking it with unsafe.

compiler message:

 1 | @main 
   | |- warning: expression uses unsafe constructs but is not marked with 'unsafe'
   | `- note: reference to unsafe static method 'main()'
 2 | enum Entery {
 3 |     @unsafe static func main() { /* Code... */ }
 4 | }

now I'm not sure if this case goes under the more general discussion about the handling of @unsafe constructs by macros (but I think @main isn't a macro technically).

also just as a comparison, Rust's solution to this problem is forbidding main from being unsafe.

Douglas_Gregor · February 12, 2025, 6:22am

I hadn't thought of the main function, thank you for bringing it up. My inclination is to have the compiler warn on the declaration of the unsafe main function, since there isn't going to be a user-written call to it ever.

Doug

Douglas_Gregor · February 12, 2025, 7:45am

Hi all,

The discussion here in this review thread has been very helpful, thank you! I've gone ahead and made another round of revisions to the proposal based on the feedback here, which I've gathered together in a pull request. The specific changes are:

Do not require declarations with unsafe types in their signature to be marked @unsafe; it is implied. They may be marked @safe to indicate that they are actually safe.
Add unsafe for iteration via the for..in syntax.
Add C(++) interoperability section that infers @unsafe for C types that involve pointers.
Document the unsafe conformances of the UnsafeBufferPointer family of types to the Collection protocol hierarchy.

The document itself has gotten a bit unwieldy, so I also did some restructuring for clarity that doesn't change the actual meaning beyond the above. Instead of a large, flat list of topics in Detailed Design, I've split out two subsections: one for sources of unsafety (@unsafe, the language constructs, @unsafe conformances, the standard library declarations that are marked @unsafe, etc.) and another for acknowledging unsafety (unsafe expression, unsafe for the for..in loop, using @safe and @unsafe to acknowledge safety by propagating it further, etc.). If you've already read it, it's not worth reading again for these changes, but it should make this document a better reference going forward.

Cheers,
Doug

SomeRandomiOSDev · February 12, 2025, 10:41pm

First off, big +1 for this proposal. One of the things that I love about the Swift language is these kinds of expression keywords (like try and await) that - at least from how I conceptualize it - serve as a callout to developers of the effect of an API call; having a keyword for calling out use of unsafe types/APIs seems like a natural direction for the language that should help cut down on the use and misuse of these APIs.

That being said, the only holdup that I have is the naming choice, at least for the @safe and @unsafe annotations. "Safe" is a broad term that can apply to a number of different concepts (e.g. memory safety, data race safety, thread safety, etc.) and I feel that using @safe and @unsafe could potentially limit future introduction of other safety checks.

As an example, let's say that at some point we'd want to apply this same concept to concurrency primitives that could lead to deadlocks. These annotations could/would be applied to either the unsafe types (e.g. pthread_mutex_t, os_unfair_lock, etc.) or the specific APIs that could deadlock (pthread_mutex_lock, os_unfair_lock_lock), or both.

When designing this annotation, @safe and @unsafe will already have a well defined meaning and therefore couldn't be used for our purpose, so we'd need to be more specific, perhaps @deadlock(Un)safe or @thread(Un)safe. While this would work, this would likely introduce confusion around the original @safe/@unsafe annotations, especially as compared to the new specific annotations, as the annotations are too broad to know what they mean without having to look up their meaning.

If we want to consider future expansions for these types of opt-in safety checks we could rename the @safe/@unsafe annotations to something like @memorySafe/@memoryUnsafe which makes them explicit about the safety (or lack thereof) that their use conveys.

Alternatively we could keep these annotations named as is but allow them to accept a parameter that would specify what kind of safety they convey:

@unsafe(memory)
public struct UnsafeBufferPointer<Element> { ... }

...

// Imported C interface
@unsafe(deadlock)
public struct pthread_mutex_t { ... }

The annotation could accept multiple parameters to specify multiple kinds of unsafety:

// Example omits `unsafe` expression keyword where compiler would warn about it for the sake of readability
@unsafe(memory, deadlock)
public final class UnsafePointerProcessor {

    var lock: pthread_mutex_t
    let handler: (UnsafeRawPointer) -> Void

    init(processor: @escaping (UnsafeRawPointer) -> Void) { ... }

    func process(_ pointer: UnsafeRawPointer) {
        // Only process one pointer at a time. Why? Because I'm pressed for an example of both memory and deadlock unsafety..
        pthread_mutex_lock(&self.lock)
        defer { pthread_mutex_unlock(&self.lock) }

        // Could introduce a deadlock if the provided closure loops back and calls `process(_:)` on this instance.
        self.handler(pointer)
    }
    }
}

The one aspect that I don't have a great answer to is the unsafe expression keyword and how it would be applied to the usage of types/APIs with multiple types of unsafety. Would you have to use multiple keywords to explicitly list out all of the unsafe kinds that you're suppressing (memoryUnsafe deadlockUnsafe self.processor.process(ptr))? Would a single unsafe suppress all safety checks (seems like a bad idea)? Perhaps an additional compiler warning when unsafe is used for types/APIs with multiple safety checks to the tune of "warning: "memory" and "deadlock" safety checks were suppressed from a single expression, add parentheses to mark as intended"?

All in all, I'm giving this proposal a strong +1 and will be enabling it as soon as it's available in a Swift release, but I just keep thinking about these kinds of future expansions and don't want it to be difficult if this is a direction we'll eventually want to take. Food for thought.

grynspan · February 13, 2025, 12:54am

Should we? We really don't know if a given C interface is safe or not. Is ioctl() safe? It deals in integers only. How about raise(), which raises a signal, something that is anathema to Swift? Is close() any safer than fclose()? And there's no way fork() is safe to use without extreme caution!

My point being that the presence of unsafe types in a C function signature is a very weak heuristic, but also that we're making subjective claims about C functions here that may lull Swift users into a sense of security: after all, if Swift imports fork() as @safe, then it must be safe to use, right?

fclout · February 13, 2025, 4:34am

It's important to decide what unsafe means as much as what unsafe doesn't mean. In particular, right now, the goal of unsafe is that you can make a strong claim that your module is not introducing memory unsafety; the idea being that if enough modules have a strong claim that they are not introducing memory unsafety, that problem is essentially solved. This is extremely relevant to the majority of attacks that we see.

From other memory-safe ecosystems, we already have ideas for what attackers might trend towards once we fix our memory safety problems: command injection attacks, file path shenanigans, deserialization bugs, design flaws, etc. Several of these are also levied against our platforms. I think that if you asked me which is more likely to cause security bugs between ioctl and string interpolation, I would probably say string interpolation.

This is all to say that modeling unsafety as "anything that Swift understands poorly", in my opinion, ends up being more about purity than security results. Attackers love memory corruption bugs because they allow them to enter "God mode" and achieve arbitrary capabilities. For instance, in the Triangulation exploit, attackers used a memory safety bug in font rendering to eventually start a JavaScript interpreter. It is undeniable that fork() and raise() can cause you problems. Can they start a JavaScript interpreter behind your back? As far as I know, the evidentiary record on the matter is sparse.

There's a tradeoff between purity and the engineering effort created for users of the strictly memory-safe mode and the people they depend on. At this time, my opinion is that there isn't much to gain by assuming that functions that don't take pointers are unsafe in the memory safety sense. I think that my opinion could change if we were able to find several examples of memory safety bugs caused by C library functions that don't use pointers.

Xazax-hun · February 18, 2025, 11:45am

The current implementation considers function pointers safe. I think the reasoning is that we cannot have temporal lifetime errors with function pointers, as the pointees are alive for the whole duration of the program. However, function pointers can be null. I understand that we usually do not consider a null dereference a security issue, as it should trap on most modern platforms, but this is not necessarily the case on all platforms, such as some embedded systems. This makes me wonder whether we should have different rules for embedded Swift vs regular Swift. Or is this an acceptable hole in some of the embedded cases?

Xazax-hun · February 18, 2025, 11:51am

I played around with the latest implementation on main and have a couple of questions about compatibility between the strict memory-safe and regular language modes.

Consider the following code snippet:

@unsafe
func g() {}

func f() {
    unsafe g()
}

This compiles fine under the -enable-experimental-feature AllowUnsafeAttribute -enable-experimental-feature WarnUnsafe flags. Unfortunately, when -enable-experimental-feature WarnUnsafe is not passed to the compiler I get some errors:

test.swift:5:11: error: consecutive statements on a line must be separated by ';'
3 |
4 | func f() {
5 |     unsafe g()
  |           `- error: consecutive statements on a line must be separated by ';'
6 | }
7 |

test.swift:5:5: error: cannot find 'unsafe' in scope
3 |
4 | func f() {
5 |     unsafe g()
  |     `- error: cannot find 'unsafe' in scope
6 | }
7 |

Similarly, without passing -enable-experimental-feature AllowUnsafeAttribute to the compiler, I get:

test.swift:1:1: error: attribute requires '-enable-experimental-feature AllowUnsafeAttribute'
1 | @unsafe
  | `- error: attribute requires '-enable-experimental-feature AllowUnsafeAttribute'
2 | func g() {}
3 |

I think this is unfortunate, as this makes it hard to write code that compiles warning and error free in both language modes. Do we have a plan to address this? I think the most convenient would be to accept both unsafe and @unsafe in both language modes. Moreover, I wonder if it would be a good idea to start to do that early while we are using the experimental flags. This would make it easier for projects to experiment without running into errors when these flags are not present. And this would help dogfood compatibility between the language modes.

John_McCall · February 18, 2025, 5:33pm

I believe the proposed design is that unsafe is accepted in all language modes. What you’re seeing is just an artifact of it being implemented as an experimental feature on the main branch.

John_McCall · February 26, 2025, 12:40am

SE-0458 has been accepted as revised during the review. Please take any further discussion to the announcement thread.

John McCall
Review Manager
