SE-0390: @noncopyable structs and enums

I appreciate it's not easy.

That I can't quite follow. Why wouldn't I be able to do something like this?

@noncopyable
struct FD {
    var underlying: CInt

    consuming func startCloseHopingItWillFinishSoon() {
        defer { forget self }
        fireAndForgetUnregisterAndClose(...)
    }

    consuming func close() async throws {
        defer { forget self }
        try await unregisterAndClose(...)
    }
}

actor ManagedFD {
    var fd: FD? // yeah, I know, currently no Generic<@noncopyable>, could use -1 or some other sentinel for now.

    static func withManagedFD<R>(_ body: (ManagedFD) async throws -> R) async throws -> R {
        let managedFD = ManagedFD(...)
        defer {
            try! await managedFD.close() // yeah I know, currently I need to spell that out...
        }
        return try await body(managedFD)
    }

    func close() async throws {
        var fd: FD? = nil
        swap(&fd, &self.fd)
        try await fd?.close()
    }

    deinit {
        if self.fd != nil {
            if strictMode { error("ManagedFD died without all resources deconstructed, tight file descriptor control lost for a while") }
            var fd: FD? = nil
            swap(&fd, &self.fd)
            fd?.startCloseHopingItWillFinishSoon()
        }
    }
}

Right, yes. You can always push responsibility one layer up. If you have file descriptors you can define that there is a FileDescriptorManager which owns all the file descriptors. And FileDescriptor's deinit merely asks its manager to do the destruction when convenient.

That however has at least two profound consequences:

  1. At a given point in our program we can no longer know the exact state of our resources. As in, an underlying file descriptor may or may not be closed when my type got deinited.
  2. At some point we reach the top level: who manages that? Apple's SDKs often use global singletons (Dispatch's thread pools, Swift Concurrency's cooperative pool, the main thread/queue/runloop, URLSession's connection pools, ...) and Rust's Tokio/mio just blocks the calling thread in Drop for Runtime and the shutdown* methods until everything's done.

Much like myself, Tokio/mio for example don't seem to be happy with non-deterministic file descriptor closes (the "just push it up one layer to a system that'll close at its convenience"). But much like deinit, Drop requires a synchronous operation too and at the same time it needs to deregister & close that fd. So how can that work? It seems that each runtime (even if multi-threaded) actually shares just one kqueue() that just gets dup'd to allow multi-threading without having to take a mutex in userland to access the selector. That way, Tokio/mio uses the kernel as the "mutex" and all registrations/deregistrations of file descriptors are shared across all threads, which means that any Tokio worker thread can deregister a file descriptor using its clone of the one kqueue().
The repercussions here are that the kernel now has to synchronise this one shared kqueue(), which likely leads to sub-linear scaling across cores. Also, mio has to issue more syscalls and ignore errors because it "cannot know" what the actual registrations are (as they can be changed concurrently by multiple threads). It just deletes all possible registrations and hopes that the underlying system is okay with that.

And more so, it will assume that you're running on a Runtime thread. If not, you're bust. For Rust, I think this checks out because for async fns you likely pick one Runtime and fly with it, so async fns aren't run on some other runtime or the "wrong thread".

So overall, I'm not fully convinced that the tradeoffs that Tokio/mio made will just carry over to Swift. Swift Concurrency doesn't have an I/O capable runtime that everything can just use so it doesn't really seem feasible to require synchronous resource destruction like Rust does. And even in Rust, this comes with profound consequences (one shared kqueue) and also requirements on the underlying kernel system (kqueue, epoll, io_uring, ...).

Summing up, I don't want to say anybody is wrong here but I don't think the argument "Rust can't do it, we won't need it" fully works. In Rust, "end-user types" are what Swift will call @noncopyable and they usually guarantee resource destruction by the time they get dropped. This works (but as we've seen comes with real implementation restrictions)! In Swift, I'd argue that most "end-user types" will not be @noncopyable, so resource management will remain really hard unless you withFunction everything or require the user to manually call close() async throws, which is super easy to forget.

I can think of several cases where having linear types would be very helpful in Swift.

  • Linear types would enable the author of a type to ensure that instances of a type are always destroyed on the main actor. This is especially helpful when working with UIKit, where almost everything must be called on the main actor.
@MainActor
@noncopyable
struct OrientationObserver {
    let cancellable: AnyCancellable
    
    public init(
        onOrientationChanged orientationChanged: @escaping (_ newOrientation: UIDeviceOrientation) -> Void
    ) {
        cancellable = NotificationCenter.default
            .publisher(for: UIDevice.orientationDidChangeNotification)
            .sink { _ in
                orientationChanged(UIDevice.current.orientation)
            }
        UIDevice.current.beginGeneratingDeviceOrientationNotifications()
    }
    
    public consuming func cancel() {
        UIDevice.current.endGeneratingDeviceOrientationNotifications()
    }
    
    fileprivate deinit {}
}

Plenty of people in these threads have stated that they would find the ability to ensure destruction on the main actor useful:

  • Linear types could be used to implement a safe continuation type without runtime checks:
public func withContinuation<T>(_ fn: (consume Continuation<T, Never>) -> Void) async -> T {
    return await withUnsafeContinuation {
        fn(Continuation($0))
    }
}

public func withThrowingContinuation<T>(_ fn: (consume Continuation<T, Error>) -> Void) async throws -> T {
    return try await withUnsafeThrowingContinuation {
        fn(Continuation($0))
    }
}

@noncopyable
public struct Continuation<T, E: Error> {
    let unsafeContinuation: UnsafeContinuation<T, E>
    
    fileprivate init(_ unsafeContinuation: UnsafeContinuation<T, E>) {
        self.unsafeContinuation = unsafeContinuation
    }
    
    public consuming func resume() where T == Void {
        unsafeContinuation.resume()
    }
    
    public consuming func resume(returning value: T) where E == Never {
        unsafeContinuation.resume(returning: value)
    }
    
    public consuming func resume(returning value: T) {
        unsafeContinuation.resume(returning: value)
    }
    
    public consuming func resume(throwing error: E) {
        unsafeContinuation.resume(throwing: error)
    }
    
    public consuming func resume(with result: Result<T, E>) {
        unsafeContinuation.resume(with: result)
    }
    
    public consuming func resume<Er>(with result: Result<T, Er>) where E == Error, Er : Error {
        unsafeContinuation.resume(with: result)
    }
    
    fileprivate deinit {}
}
  • Linear types would be helpful for actors that would like to run isolated code upon destruction
  • Linear types would be helpful for types that would like to run asynchronous code upon destruction
  • Linear types would be helpful for types where the destructor can fail/throw
  • Linear types would be helpful for types where destruction can be expensive/blocking and thus want to make destruction explicit

Linear types aren't "necessary" — you can make working code without them — but Swift is full of helpful tools that aren't "necessary", but help ensure your code is correct (automatic reference counting, explicit nullability, enumerations, noncopyable types (soon), and even structures).

I'd like to experiment with noncopyable types with SQLite statements. Those are objects that should not have several users: for example, if the database rows fetched from a statement are iterated by two consumers, neither will get the rows it expects. I currently put advice against statement sharing in the documentation: move-only types look like a good fit.

But I can't start experimenting. The linked toolchain macOS #583 won't run with the following dialog (developer not verified):

/Library/Developer/Toolchains/swift-PR-63783-583.xctoolchain/usr/bin/swift package build

And GRDB version 6.8.0 won't build with the latest available toolchain swift-DEVELOPMENT-SNAPSHOT-2023-02-23-a.xctoolchain aka org.swift.57202302231a:

$ /Library/Developer/Toolchains/swift-DEVELOPMENT-SNAPSHOT-2023-02-23-a.xctoolchain/usr/bin/swift build
...
error: compile command failed due to signal 6 (use -v to see invocation)
Failed to reconstruct type for $sScs12ContinuationVyxq__GD
Original type:
(struct_type decl=_Concurrency.(file).AsyncThrowingStream.Continuation
  (parent=bound_generic_struct_type decl=_Concurrency.(file).AsyncThrowingStream
    (generic_type_param_type depth=0 index=0 decl=_Concurrency.(file).AsyncThrowingStream.Element)
    (generic_type_param_type depth=0 index=1 decl=_Concurrency.(file).AsyncThrowingStream.Failure)))
...
4.	While evaluating request IRGenRequest(IR Generation for file "/Users/groue/Documents/git/groue/GRDB.swift/GRDB/ValueObservation/SharedValueObservation.swift")
5.	While emitting IR SIL function "@$s4GRDB22SharedValueObservationC6values15bufferingPolicyAA05AsynccD0VyxGScs12ContinuationV09BufferingG0Oyxs5Error_p__G_tFfA_".
 for expression at [/Users/groue/Documents/git/groue/GRDB.swift/GRDB/ValueObservation/SharedValueObservation.swift:372:90 - line:372:91] RangeText="."
...

I think the toolchain should show up under Privacy & Security system settings pane for you to "allow anyway" after that happens. The blunt hammer of xattr -rc swift*.xctoolchain might do the trick if that doesn't.

I'll ask to see if that's a known issue. Thanks for giving it a try!

You gave me the energy to try again. Right-click + Open on the .xctoolchain file was enough for swift build to ask me if I want to open anyway, and... build #583 successfully builds GRDB 6.8.0 :tada: OK I can start toying now :-)

:sweat_smile: I was ready to deal with MaybeStatement as a temp replacement for optionals (kudos for the Working around the generics restrictions section :+1:), but I don't quite know how to deal with sequences! Iterators return optionals, so iterators can't produce noncopyable types. All right, be tough, Gwendal, you'll crawl your way :muscle:

It's tough indeed.

A simplified version of the proposal FileDescriptor won't build when embedded as a new file in my package:

import Foundation

@_moveOnly
struct FileDescriptor {
  private var fd: Int32

  init(fd: Int32) { self.fd = fd }

  func write(buffer: Data) {
    print(buffer)
  }

  deinit {
    Darwin.close(fd)
  }
}

$ /Library/Developer/Toolchains/swift-PR-63783-583.xctoolchain/usr/bin/swift build
Building for debugging...
error: compile command failed due to signal 6 (use -v to see invocation)
Begin Error in Function: '$s4GRDB14FileDescriptorVfD'
Owned function parameter without life ending uses!
Value: %0 = argument of bb0 : $FileDescriptor            // users: %1, %3

End Error in Function: '$s4GRDB14FileDescriptorVfD'
Found ownership error?!
triggering standard assertion failure routine
UNREACHABLE executed at /Users/ec2-user/jenkins/workspace/swift-PR-toolchain-macos/branch-main/swift/lib/SIL/Verifier/LinearLifetimeCheckerPrivate.h:211!
[...]
---
4.	While evaluating request ExecuteSILPipelineRequest(Run pipelines { Non-Diagnostic Mandatory Optimizations, Serialization, Rest of Onone } on SIL for GRDB)
5.	While running pass #46 SILFunctionTransform "OwnershipModelEliminator" on SILFunction "@$s4GRDB14FileDescriptorVfD".
 for 'deinit' (at /Users/groue/Documents/git/groue/GRDB.swift/GRDB/Core/FileDescriptor.swift:13:3)
6.	Found verification error when verifying before lowering ownership. Please re-run with -sil-verify-all to identify the actual pass that introduced the verification error.
7.	While verifying SIL function "@$s4GRDB14FileDescriptorVfD".
 for 'deinit' (at /Users/groue/Documents/git/groue/GRDB.swift/GRDB/Core/FileDescriptor.swift:13:3)
[...]

I'm not sure how I can help. The review ends on March 7, but I'm not sure the compiler team will be able to ship a toolchain able to deal with GRDB by this date. That's fine, I'll play later with the result of the proposal.

I'm not against exploring must-explicitly-consume types as a new language feature, but it's definitely out of scope for this proposal. Going beyond "Rust can't do it", I get the impression from conversations with Rust developers that Rust tried to do it, and rejected the results. Gankra had posted some helpful links about the history here in the pitch thread:

One of the points she makes is that ensuring something is explicitly consumed is a hard thing to prove through any level of abstraction or indirection. Even Optional<T> becomes difficult to work with, since you get pushed to explicitly consume the value even on paths you know it will always be nil. And that's in a language where static lifetime values are relatively pervasive; as you noted, for most Swift code, the norm will likely remain to work with copyable and shared-ownership objects, making static enforcement even harder to maintain. As soon as a value ends up owned by a class, an escaping closure, global variable, or any other shared-ownership location, then the only place it can be consumed is in that containing object's destructor, which could again run at any time in a context you don't control, defeating many of the proposed use cases.

The current design of Sequence will definitely need some rethinking in how it will work with noncopyable types, since it is currently designed in such a way that the Iterator basically always has to own a copy of the Sequence, which is of course impossible when you can't copy something. The Collection model is likely to work a little bit better with a noncopyable type, since an index is always passed alongside the original collection, allowing you to implement a subscript that _reads or _modifys elements of the type in-place.

There's a bug we found with codegen of deinits in types that are otherwise "trivial". One way to work around this is to add an extra field of class type that doesn't do anything:

@_moveOnly
struct FileDescriptor {
  private var fd: Int32
  private let _workaround: AnyObject? = nil
}

Note also that some of our optimization passes may still lead to invalid codegen with noncopyable types, so it's best to stick to -Onone for testing purposes.

All right. If it's of any use, GRDB does not use sequences. It uses "cursors" - Cursor is like IteratorProtocol, but its next method can throw. There are cursors of database rows, but also cursors of statements built from an SQL string ("SELECT ...; SELECT ...; etc").

Usage:

while let value = try cursor.next() {
  // use value
}
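To make the shape of such a cursor concrete, here is a minimal sketch. The protocol and type names (ThrowingCursor, PositiveNumbers, CursorError) are hypothetical stand-ins invented for illustration and are not GRDB's actual declarations; the only property taken from the description above is that next() may throw.

```swift
// Hypothetical stand-in for a cursor protocol; NOT GRDB's actual declaration.
protocol ThrowingCursor {
    associatedtype Element
    // Like IteratorProtocol.next(), except that it may throw.
    mutating func next() throws -> Element?
}

// Hypothetical error type used only by this sketch.
struct CursorError: Error {}

// Toy cursor over an array that fails when it meets a negative number.
struct PositiveNumbers: ThrowingCursor {
    private var remaining: ArraySlice<Int>

    init(_ numbers: [Int]) {
        remaining = numbers[...]
    }

    mutating func next() throws -> Int? {
        guard let first = remaining.popFirst() else { return nil }
        if first < 0 { throw CursorError() }
        return first
    }
}

// Consumes a cursor with the same `while let ... try` loop shown above.
func sum(_ numbers: [Int]) throws -> Int {
    var cursor = PositiveNumbers(numbers)
    var total = 0
    while let value = try cursor.next() {
        total += value
    }
    return total
}
```

Because next() returns an Optional, this shape runs into the same restriction as iterators while Optional cannot wrap a noncopyable value.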

I'll be greatly interested in the adaptations to sequences and iterators for move-only types, because I'll need to apply them to cursors as well.

You might have to wait until some of the other features we're working on, like borrow bindings, also come into place to fully integrate a move-only cursor with the language. Without them, the only way to really produce a lifetime-dependent value would be with a higher-order function, like:

extension Statement {
  borrowing func withCursor(_ body: (borrowing Cursor) -> ()) { ... }
}

extension Cursor {
  borrowing func forEachRow(_ body: (borrowing Row) throws -> ()) rethrows { ... }
}

It sounds like there's a fix on the way for this issue. Toolchains from 02-23 on should have a fix for the deinit codegen issue you ran into earlier as well.

I'd like to protect the current ergonomics of the library, which GRDB users appear to be fond of. I'm reluctant to break existing user code without a good reason. Some of the best patterns of the library are seven years old: that's something to take care of.

I only aim at API compatibility, not ABI. Ideally, I could replace the Statement class with a noncopyable Statement struct with only very few users noticing the change, and perhaps even zero if they're already good citizens and don't share statements, as documented. The lib is already in good shape for this goal: users usually do not see Statement instances. And if they do see those instances, they are actively encouraged not to store them.

A typical usage of SQLite statement is when users aim at the best performance:

// Very close to the raw SQLite speed.
// Users who aim at sheer performance are really happy they can do this.
// Ideally, this would still compile once Statement is a move-only type.
let statement = try db.cachedStatement(sql: "INSERT INTO player (name, score) VALUES (?, ?)")
for player in players {
    try statement.execute(arguments: [player.name, player.score])
}

Instead of:

// Convenient, hidden raw statement,
// but frequently Codable-based,
// and always full of string-based accesses to column values.
// Can't achieve light speed.
for player in players {
    try player.insert(db)
}

Acute readers will notice let statement = try db.cachedStatement(...) in the above snippet. Yeah, SQLite statement compilation takes time (parsing + query plan), so caching can be important. I assume that the move-only Statement.deinit will be able to move the raw sqlite3_statement* pointer back into the cache after use. I said "assume", but I meant that I need this to be possible.
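A rough sketch of what "deinit hands the handle back to the cache" could look like. Everything here is hypothetical: StatementCache, checkOut and checkIn are invented names, an Int stands in for the raw statement pointer, and the struct is written with the ~Copyable spelling used by later toolchains rather than the @noncopyable / @_moveOnly spellings seen in this thread. It is not GRDB code.

```swift
// Hypothetical cache keyed by SQL text. An Int stands in for the raw
// statement pointer so the sketch stays self-contained.
final class StatementCache {
    private var handles: [String: Int] = [:]
    private var nextHandle = 1
    private(set) var compileCount = 0

    // Reuses a cached handle when one is available, otherwise "compiles" a new one.
    func checkOut(sql: String) -> Statement {
        if let handle = handles.removeValue(forKey: sql) {
            return Statement(sql: sql, handle: handle, cache: self)
        }
        compileCount += 1
        let handle = nextHandle
        nextHandle += 1
        return Statement(sql: sql, handle: handle, cache: self)
    }

    // Called by Statement.deinit to give the handle back.
    fileprivate func checkIn(sql: String, handle: Int) {
        handles[sql] = handle
    }
}

// Assumption: the toolchain accepts the `~Copyable` spelling.
struct Statement: ~Copyable {
    let sql: String
    let handle: Int
    private let cache: StatementCache

    fileprivate init(sql: String, handle: Int, cache: StatementCache) {
        self.sql = sql
        self.handle = handle
        self.cache = cache
    }

    // Placeholder for real work; borrows self, so the statement stays usable.
    func execute() {}

    // Runs when the unique owner goes away: the handle returns to the cache.
    deinit {
        cache.checkIn(sql: sql, handle: handle)
    }
}

// Checks the same SQL out twice; the second checkout should hit the cache.
func compileCountAfterTwoUses(of sql: String) -> Int {
    let cache = StatementCache()
    for _ in 0..<2 {
        let statement = cache.checkOut(sql: sql)
        statement.execute()
    }
    return cache.compileCount
}
```

The cache only ever stores the handle, never the Statement value itself, so there is exactly one owner of each handle at a time: either the cache or a live Statement.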

to add another data point if it helps, swift-mongodb has a type Mongo.Batches that is very similar to @gwendal.roue ’s cursor as he describes it. Mongo.Batches is an ARC type because it holds a reference to a Mongo.Connection, and Mongo.Batches is able to release that reference early when the user exhausts the database cursor, instead of waiting for user code to exit the iteration scope.

try await session.run(command: query, against: "my-database")
{
    for try await _:[Element] in $0
    {
    }
    // should be able to re-use the cursor’s connection here ...
}
// ... instead of waiting until here.

i don’t see this use case as being blocked on noncopyable, because i can achieve this behavior with ARC. but if we had @noncopyable, the Mongo.Batches type could become stack-allocated and non-refcounted, since there is never any reason to escape it from the iteration scope, it is only an ARC type right now to prevent lingering references to its wrapped Mongo.Connection.

These are great examples! Thank you so much for sharing them!

And in fact making your Statement type a noncopyable struct would help people who have been misusing it to find and correct those potential bugs.

Another interesting point about let statement = db.cachedStatement(...) is how it ties into some related work we're eyeing about supporting constrained lifetimes. In this case, it would be nice to be able to guarantee that statement could never outlive the db that provided it. (In practice, I don't think this is critical for your use case since db objects are typically very long-lived. A noncopyable form would discourage people from storing statements, which is likely sufficient.)

Yeah, a key goal of noncopyable is to eliminate runtime ARC for cases where an object naturally has a limited lifecycle.

Tim

Yes, and:

(In practice, I don't think this is critical for your use case since db objects are typically very long-lived. A noncopyable form would discourage people from storing statements, which is likely sufficient.)

Correct! No one has ever reported a crash due to a use of the unowned reference to the database connection held by the current Statement class. I agree that it's better when the mere possibility of programming errors can be discarded, but I may also keep the code simple and stick to YAGNI. We have years of experience of relying on API design as the poor man's borrow checker.

As someone that inhabits userland I'm always somewhat reluctant to weigh in on evolution proposals, but a couple of thoughts.

The only reason I actually read the proposal was I happened to skim the initial responses and I had a visceral allergic reaction to reading that ?Copyable was proposed to mean maybe Copyable. I therefore read the full proposal and the full thread here.

I would like to strongly encourage the future direction to settle on syntax other than ?Copyable. The ? operator already has two extremely important uses in Swift, obviously in terms of optionals and also the ternary operator. While ?Copyable would only make my eyes bleed as an experienced user, more importantly I feel it has the potential to add unnecessary confusion for new learners of Swift. It may be hard for experienced programmers to recall that for new programmers, Optionals take quite a bit of work to get your head around. Perhaps many people who contribute to Swift Evolution never experienced this problem. It remains real in userland. Any additional use of the ? character in syntax for another unrelated language feature seems like an unnecessary lack of clarity.

My immediate reaction was that ~Copyable could be a great syntax for this, and I then read above that @michelf suggested a similar thing. I would describe this slightly differently from Michel. In my view, this syntax should be considered to indicate something is "agnostic" to whether a type is Copyable. I'm aware this is in the future directions part of the proposal, but it was the part that mattered to me where it didn't feel quite right yet.

In terms of the overall proposal, it seems a valuable addition. I'd have thought the most natural fit would be noncopyable struct... without the @ in the same way that we use final class..., (it's a struct but is more tightly constrained in this one way, just like a final class is a class but it is constrained in that it can't be subclassed). But I read the arguments above for why @noncopyable makes more sense to the authors than noncopyable, and I accept they seem well formed.

From a broader perspective, this is a rather unusual situation where, since every type has been implicitly conforming to an unstated Copyable, we're slightly painted into a corner: we can't just use the expected standard Swift form, which would be to simply leave this declaration out when the type doesn't conform to it. OK, fair enough. As a broader point, it would be good if someone knowledgeable is casting an eye around to think about whether there are any other analogous situations that may need to be addressed in future, so that a pattern is settled on that will work for both.

The opposite is true here: Rust has a Clone trait defining an explicit clone mechanism ((&Self) -> Self) and a Copy trait inheriting from it that enables implicit copying.

Hence, if you wanted to provide a function on your @noncopyable type that creates an explicit copy, clone() would probably be a good choice based on prior art. In fact, all of the 4 terms you suggested read to me like explicit rather than implicit copies, e.g. “recreating” the value from scratch rather than just copying bits around.

From what I've read, "copy" in Rust means to create a new instance from the bits of another instance, while "clone" means to create a new instance equivalent to another instance with no constraints on how it's done. Therefore, "clone" in Rust is equivalent to "copy" in Swift, and "copy" in Rust is equivalent to "bitwise copy" in Swift (even though Rust does not allow implicit clones).
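As a small illustration of the Swift side of that comparison, the sketch below shows an implicit Swift copy of a struct whose storage is not plain bits. The Packet type and the function name are made up for the example.

```swift
// Hypothetical value type: `payload` is an Array, whose storage lives in a
// reference-counted, copy-on-write buffer rather than inline in the struct.
struct Packet {
    var id: Int
    var payload: [UInt8]
}

// Copies a packet implicitly, mutates the copy, and reports both payload sizes.
func payloadCountsAfterCopyAndAppend() -> (original: Int, copy: Int) {
    let original = Packet(id: 1, payload: [1, 2, 3])

    // Implicit copy: no method call is written, yet the array's buffer has to
    // be retained, so this is more than duplicating the struct's bits.
    var copy = original

    // Mutating the copy leaves the original untouched (copy-on-write).
    copy.payload.append(4)

    return (original.payload.count, copy.payload.count)
}
```

In Rust terms, the assignment to copy behaves like a clone of a type that is not Copy, which is why Swift's "copy" lines up with Rust's "clone" rather than with Rust's "copy".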

That being said, I can see how that statement was confusing.

Rust is really the outlier in terminology here. The idea of copying a value is common to every programming language. In most languages, of course, it’s an implementation-level concept; it’s only surfaced in the source language in “systems-y” languages that try to give the programmer control over basic representations. But it’s still a core concept that you sometimes need to create independent copies of a value, and that operation isn’t always going to be as simple as copying bits. “Copy” in this sense has a very long history, much older than Rust or probably any language still in major use.

That Rust chooses to tie the implicitness of the copy to its bitwise-ness makes perfect sense for its design goals. And I understand how they ended up picking the names here, because the result of that choice is that they don’t have a built-in notion of non-bitwise copying at all, other than the default memberwise derivation of the clone operation. But I do think they’ve done the broader PL world a bit of a disservice, in that people who’ve started with Rust sometimes struggle to realize that “copy” elsewhere is not inherently constrained the way it is in Rust.
