Mach Port API

Hello everyone,

Here is a proposal for a new interface to Mach Ports that I've been working on. If you'd like to dig in on the implementation details, there's a PR open on the Swift System repo.

Swift Mach Port Interface

Introduction

Mach ports are an arcane technology that is difficult to wield safely.
However, as an integral component of our operating system, they occasionally require handling.

This proposal makes extensive use of Mach port terminology.
If you'd like to review the basics, please check out Ports, Port Rights, Port Sets, and Port Namespaces.

Motivation

Mach ports are difficult to get right, due mostly to how mach port rights are managed. Programmers are expected to track types, lifecycles, and other state in their heads.

Swift's advanced type system, recently augmented with move-only types, provides a new opportunity to create a Mach port interface able to prevent entire classes of bugs at compile time.

Proposed solution

  • Establish distinct types to represent receive, send, and send-once rights.
  • Provide automatic lifecycle management of Mach port rights, which are unlike normal OOP objects.

Detailed design

Mach.Port<T> is the managed equivalent of a mach_port_name_t.
Mach port names can be moved in and out of instances as needed using init and relinquish.

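func doStuffWithSendRightThenReturnIt(name: mach_port_name_t) -> mach_port_name_t {
	// Take ownership of the unmanaged send right for the duration of the call.
	let send = Mach.Port<Mach.SendRight>(name: name)
	/* ... */
	// Hand the right back to the caller; no automatic deallocation happens.
	return send.relinquish()
}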

Types

There are three types of Mach port right: receive, send, and send-once. In the C interface, the type of right being manipulated is not known by the compiler. In this Swift interface, the types explicitly declare which rights can be created from what other rights, and in what way, eliminating KERN_INVALID_RIGHT runtime failures.
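
To illustrate the idea, here is a minimal sketch of how a generic marker type can restrict which operations exist on which right. It is a toy model, not the proposed implementation: PortName stands in for mach_port_name_t, the Receive, Send, and SendOnce markers and the Port type are hypothetical names, no kernel calls are made, and the types are left copyable to keep the example short.

```swift
// Stand-in for mach_port_name_t so the sketch compiles without Darwin.
typealias PortName = UInt32

// Hypothetical marker types, one per kind of right.
protocol PortRight {}
enum Receive: PortRight {}
enum Send: PortRight {}
enum SendOnce: PortRight {}

struct Port<Right: PortRight> {
    let name: PortName
}

extension Port where Right == Receive {
    // Only a receive right can mint new send or send-once rights.
    func makeSendRight() -> Port<Send> {
        Port<Send>(name: name)          // send rights share the receive right's name
    }

    func makeSendOnceRight() -> Port<SendOnce> {
        Port<SendOnce>(name: name + 1)  // toy stand-in for a fresh kernel-chosen name
    }
}

extension Port where Right == Send {
    // A send right can only be copied into another send right.
    func copySendRight() -> Port<Send> {
        Port<Send>(name: name)
    }
}

let receive = Port<Receive>(name: 0x103)
let send = receive.makeSendRight()
let copy = send.copySendRight()
let once = receive.makeSendOnceRight()
// send.makeSendOnceRight() would not compile: Port<Send> has no such member.
```

The invalid operation is rejected by the type checker, which is the compile-time analogue of the KERN_INVALID_RIGHT failure in the C interface.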

Allocation

Send and receive (but not send-once) port names are coalesced. In this interface, the caller is no longer involved in specifying the destination name when creating new rights. So, the KERN_RIGHT_EXISTS runtime error is not possible.

Automatic Deallocation

All valid (in the Mach sense) port rights must be deallocated exactly once, including dead names. So, rights are deallocated when the object is deinitialized, unless a right is relinquished, in which case ownership is transferred out of the object and automatic deallocation at destruction is disabled. For receive rights this is a mach_port_mod_refs with a delta of -1, but the intent is the same.
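
The deallocate-unless-relinquished rule can be sketched with a noncopyable struct. This is a toy model under stated assumptions: it needs Swift 5.9 or later, PortName stands in for mach_port_name_t, and ManagedRight and FakeKernel are hypothetical names that only record what a real implementation would release in the kernel.

```swift
// Stand-in for mach_port_name_t; this sketch makes no real Mach calls.
typealias PortName = UInt32

// Hypothetical log of "deallocated" names so the behavior is observable.
enum FakeKernel {
    static var deallocated: [PortName] = []
}

struct ManagedRight: ~Copyable {
    private let name: PortName

    init(name: PortName) { self.name = name }

    // Lend the name to a closure without giving up ownership.
    borrowing func withBorrowedName<R>(_ body: (PortName) -> R) -> R {
        body(name)
    }

    // Hand the name back to the caller; `discard self` suppresses deinit.
    consuming func relinquish() -> PortName {
        let name = self.name
        discard self
        return name
    }

    deinit {
        // A real implementation would release the right in the kernel here.
        FakeKernel.deallocated.append(name)
    }
}

func demo() -> PortName {
    do {
        let scoped = ManagedRight(name: 1)
        scoped.withBorrowedName { print("using name \($0)") }
    }   // `scoped` is destroyed here, so name 1 is logged
    let kept = ManagedRight(name: 2)
    return kept.relinquish()   // name 2 leaves unmanaged; its deinit never runs
}

let survivor = demo()
```

Because the struct cannot be copied, there is exactly one owner at any time, so the right is released either once by deinit or never (after relinquish), and the compiler rejects any use of the value after it has been consumed.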

There is very little functional difference between MACH_PORT_DEAD and a valid dead name; MACH_PORT_DEAD simply means the port died before entering the task that received it. However, MACH_PORT_DEAD does not represent a right that requires deallocation. So, for convenience, constructing rights with MACH_PORT_DEAD is allowed, but automatic deallocation won't happen.

Limits

The ipc space (the kernel-side storage of a process's Mach port state) should be large enough to fit any reasonable workload. So, any time there is not enough ipc space to create a new right (KERN_NO_SPACE), the process will abort with a message indicating a possible bug. In the Swift interface this is only possible when creating receive or send-once rights.

Similar to ipc space, the uref field should be wide enough to fit any reasonable workload. So, in cases where a uref would overflow (KERN_UREFS_OVERFLOW), the process is aborted with a message indicating a possible bug.

Swift Interface

#if $MoveOnly && (os(macOS) || os(iOS) || os(watchOS) || os(tvOS))

import Darwin.Mach

protocol MachPortRight {}

enum Mach {
    @_moveOnly
    struct Port<RightType:MachPortRight> {
        /// Transfer ownership of an existing unmanaged Mach port right into a
        /// Mach.Port by name.
        ///
        /// This initializer aborts if name is MACH_PORT_NULL.
        ///
        /// If the type of the right does not match the type T of Mach.Port<T>
        /// being constructed, behavior is undefined.
        ///
        /// The underlying port right will be automatically deallocated at the
        /// end of the Mach.Port instance's lifetime.
        ///
        /// This initializer makes a syscall to guard the right.
        init(name: mach_port_name_t)

        /// Borrow access to the port name in a block that can perform
        /// non-consuming operations.
        ///
        /// Take care when using this function; many operations consume rights,
        /// and send-once rights are easily consumed.
        ///
        /// If the right is consumed, behavior is undefined.
        ///
        /// The body block may optionally return something, which will then be
        /// returned to the caller of withBorrowedName.
        func withBorrowedName<ReturnType>(body: (mach_port_name_t) -> ReturnType) -> ReturnType
    }

    /// Possible errors that can be thrown by Mach.Port operations.
    enum PortRightError : Error {
        /// Returned when an operation cannot be completed, because the Mach
        /// port right has become a dead name. This is caused by deallocation of the
        /// receive right on the other end.
        case deadName
    }

    /// The MachPortRight type used to manage a receive right.
    struct ReceiveRight : MachPortRight {}

    /// The MachPortRight type used to manage a send right.
    struct SendRight : MachPortRight {}

    /// The MachPortRight type used to manage a send-once right.
    ///
    /// Send-once rights are the most restrictive type of Mach port rights.
    /// They cannot create other rights, and are consumed upon use.
    ///
    /// Upon destruction a send-once notification will be sent to the
    /// receiving end.
    struct SendOnceRight : MachPortRight {}

    /// Create a connected pair of rights, one receive, and one send.
    ///
    /// This function will abort if the rights could not be created.
    /// Callers may assert that valid rights are always returned.
    static func allocatePortRightPair() -> (Mach.Port<Mach.ReceiveRight>, Mach.Port<Mach.SendRight>)
}

extension Mach.Port where RightType == Mach.ReceiveRight {
    /// Transfer ownership of an existing, unmanaged, but already guarded,
    /// Mach port right into a Mach.Port by name.
    ///
    /// This initializer aborts if name is MACH_PORT_NULL.
    ///
    /// If the type of the right does not match the type T of Mach.Port<T>
    /// being constructed, the behavior is undefined.
    ///
    /// The underlying port right will be automatically deallocated when
    /// the Mach.Port object is destroyed.
    init(name: mach_port_name_t, context: mach_port_context_t)

    /// Allocate a new Mach port with a receive right, creating a
    /// Mach.Port<Mach.ReceiveRight> to manage it.
    ///
    /// This initializer will abort if the right could not be created.
    /// Callers may assert that a valid right is always returned.
    init()

    /// Transfer ownership of the underlying port right to the caller.
    ///
    /// Returns a tuple containing the Mach port name representing the right,
    /// and the context value used to guard the right.
    ///
    /// This operation liberates the right from management by the Mach.Port,
    /// and the underlying right will no longer be automatically deallocated.
    ///
    /// After this function completes, the Mach.Port is destroyed and no longer
    /// usable.
    __consuming func relinquish() -> (mach_port_name_t, mach_port_context_t)

    /// Remove guard and transfer ownership of the underlying port right to
    /// the caller.
    ///
    /// Returns the Mach port name representing the right.
    ///
    /// This operation liberates the right from management by the Mach.Port,
    /// and the underlying right will no longer be automatically deallocated.
    ///
    /// After this function completes, the Mach.Port is destroyed and no longer
    /// usable.
    ///
    /// This function makes a syscall to remove the guard from
    /// Mach.ReceiveRights. Use relinquish() to avoid the syscall and extract
    /// the context value along with the port name.
    __consuming func unguardAndRelinquish() -> mach_port_name_t

    /// Borrow access to the port name in a block that can perform
    /// non-consuming operations.
    ///
    /// Take care when using this function; many operations consume rights.
    ///
    /// If the right is consumed, behavior is undefined.
    ///
    /// The body block may optionally return something, which will then be
    /// returned to the caller of withBorrowedName.
    func withBorrowedName<ReturnType>(body: (mach_port_name_t, mach_port_context_t) -> ReturnType) -> ReturnType

    /// Create a send-once right for a given receive right.
    ///
    /// This does not affect the makeSendCount of the receive right.
    ///
    /// This function will abort if the right could not be created.
    /// Callers may assert that a valid right is always returned.
    func makeSendOnceRight() -> Mach.Port<Mach.SendOnceRight>

    /// Create a send right for a given receive right.
    ///
    /// This increments the makeSendCount of the receive right.
    ///
    /// This function will abort if the right could not be created.
    /// Callers may assert that a valid right is always returned.
    func makeSendRight() -> Mach.Port<Mach.SendRight>

    /// Access the make-send count.
    ///
    /// Each get/set of this property makes a syscall.
    var makeSendCount : mach_port_mscount_t { get set }
}

extension Mach.Port where RightType == Mach.SendRight {
    /// Transfer ownership of the underlying port right to the caller.
    ///
    /// Returns the Mach port name representing the right.
    ///
    /// This operation liberates the right from management by the Mach.Port,
    /// and the underlying right will no longer be automatically deallocated.
    ///
    /// After this function completes, the Mach.Port is destroyed and no longer
    /// usable.
    __consuming func relinquish() -> mach_port_name_t

    /// Create another send right from a given send right.
    ///
    /// This does not affect the makeSendCount of the receive right.
    ///
    /// If the send right being copied has become a dead name, meaning the
    /// receiving side has been deallocated, then copySendRight() will throw
    /// a Mach.PortRightError.deadName error.
    func copySendRight() throws -> Mach.Port<Mach.SendRight>
}


extension Mach.Port where RightType == Mach.SendOnceRight {
    /// Transfer ownership of the underlying port right to the caller.
    ///
    /// Returns the Mach port name representing the right.
    ///
    /// This operation liberates the right from management by the Mach.Port,
    /// and the underlying right will no longer be automatically deallocated.
    ///
    /// After this function completes, the Mach.Port is destroyed and no longer
    /// usable.
    __consuming func relinquish() -> mach_port_name_t
}

#endif

Alternatives considered

Initially I tried having the port rights be RawRepresentable<mach_port_name_t>, which encourages passing instances to functions that will implicitly cast to mach_port_name_t. Since these APIs often consume the right, this prevents the interface from being able to automatically manage the right's lifecycle.

There are two existing precedents that IMO make better naming candidates than relinquish:

  • It looks like this pitch is dependent on move semantics. Since consuming (spelled __consuming in the proposed interface) is how the compiler describes the semantics of the method, why not just name the method func consume()?
  • Unmanaged<T> has the {take,pass}{Retained,Unretained} methods. None of these mutate the Unmanaged<T>. take is also used in move semantics to describe an operation that consumes a binding and returns the value held by it, so func take() seems a better candidate than func pass().

I’m also not sure whether the use of generics to model permissions has much precedent. It feels very C++y to me. Why not struct Mach.SendRight, struct Mach.SendOnceRight, and struct Mach.ReceiveRight?

I'd be wary of naming a function consume if we ultimately settle on consume as an ownership operator; and likewise take if instead we settle on take as an operator. There doesn't seem to be a need to be clever here, IMO: transferOwnership seems like it'd be plenty clear.

This is really neat, but I’m not sure why you’re proposing it for System. I know System is “multiplatform not cross-platform”, but that doesn’t mean it has to contain every low-level API on every platform. Why isn’t this import Mach on Apple platforms only?

Why is that better? Seems like this generates a large number of fiddly platform-specific modules instead of consolidating system-level APIs for each platform under one package.

Since cross-platform support is a non-goal for Swift System, it is essentially by definition a grouping of platform-specific APIs. Where two platforms have similar-looking APIs with the same name, they use ad-hoc polymorphism to make them source-compatible. But this only applies to a subset.

That sounds better to me: then the experts on each platform can design their own APIs without having to harmonize with others*, the reviewers for each module can be different, the documentation won’t have a bunch of irrelevant sections for users to scroll past or search through, and the implementations don’t need to be littered with platform conditionals. And cf Foundation’s recent announcement, even on one platform having to compile a monolith of an API and hope dead code stripping only gives you the parts you want isn’t great for code size (or build time).

* I know this is a little dubious; normally I’d value consistency here! But these are low-level APIs, and fidelity is probably more important than consistency in this case.

EDIT: Pre-emptive response to “but isn’t that what import Darwin does?” I wasn’t quite at Apple early enough for the decision to make all of Darwin one module, but I’m pretty sure it was “just” to not have to worry about the mass of non-modular headers inherited from decades-old BSDs, and to focus on Apple’s preferred APIs in frameworks. New API suites in /usr/include bear this out as they have often gotten their own module, like os.log. (Okay, this undercuts my point a little because submodules are not treated as totally distinct.)

Another benefit to keeping these APIs in separate modules is cross-import overlays. If I want to add some extensions to accept Mach types, I would much prefer to only depend on the Mach module than all of Swift System.

Swift System could still reexport the Mach module, no?

^ swift-collections is a great model to copy, IMO. Each collection type is essentially its own package with the umbrella Collections module re-exporting everything.

Given that swift-system is meant to be a low-level interface, giving the user the option to choose just the parts they need would make the most sense for the intended use cases, and if someone wants to just import System, they can do that too.

I have a question about the various initializers, particularly this sentence which occurs multiple times:

        /// If the type of the right does not match the type T of Mach.Port<T>
        /// being constructed, behavior is undefined.

This simply states that if I'm holding it wrong, I can get undefined behavior. Can't we throw an error instead? Was that seen as too expensive? I would prefer to simply not have the possibility of undefined behavior, even at the cost of a little time.

There are two reasons. The first is indeed perf cost. It requires a syscall to check what right types are held for a given port, which we want to avoid if you are doing the right thing. The second is that we can't actually know for sure. Receive rights and send rights are coalesced. So, for example, if you hold 1 receive right and 3 send rights for a port, they'll all have the same mach_port_name_t value, and we won't be able to tell if the one you've used is one of those, let alone which one of those it might be.

We have talked about maybe having an environment variable or something similar that can be flipped on to put this interface into a pedantic mode that will do expensive checks like this.
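
To make the coalescing point concrete, here is a toy model of a task's port namespace. It is only a sketch: ToySpace, Entry, and the name values are hypothetical, PortName stands in for mach_port_name_t, and nothing here calls into the kernel.

```swift
// Stand-in for mach_port_name_t; this is a model, not a real IPC space.
typealias PortName = UInt32

struct ToySpace {
    // One entry per name: an optional receive right plus a count of send rights.
    struct Entry {
        var holdsReceive = false
        var sendRefs = 0
    }

    private(set) var entries: [PortName: Entry] = [:]
    private var nextName: PortName = 0x103

    // Allocating a port creates a new name that holds the receive right.
    mutating func allocateReceive() -> PortName {
        let name = nextName
        nextName += 4
        entries[name] = Entry(holdsReceive: true, sendRefs: 0)
        return name
    }

    // Making a send right does not create a new name: it is coalesced
    // into the existing entry and only the reference count changes.
    mutating func makeSend(from receive: PortName) -> PortName {
        entries[receive]!.sendRefs += 1
        return receive
    }
}

var space = ToySpace()
let receive = space.allocateReceive()
let sends = (0..<3).map { _ in space.makeSend(from: receive) }
print(receive, sends)   // all four values are the same name
```

With one receive right and three send rights behind a single name, a bare name carries no information about which of those rights the caller meant to hand over, which is why a runtime check cannot fully validate the initializer's argument.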

Could we have a safe-by-default throwing initializer coupled with an unchecked one?

extension Mach.Port {
  public init(name: mach_port_name_t) throws

  public init(unchecked name: mach_port_name_t)
}

In this way, potential unsafety has something to mark it.

Code that comes in direct contact with mach ports is more likely to be sensitive to performance. So, I think we want the default path to be the fast one (it's already way safer than holding plain old mach_port_name_ts).

I do see some value in the ability to express that certain code may be okay with paying the cost for extra checks every time. Is there any precedent for doing the inverse? i.e. having the bare name be unchecked by default, along with an init(checked name: mach_port_name_t).

I don't like going more-unsafe by default. The bounds-checking mode in Unsafe*BufferPointer is perhaps the closest, where it checks and traps in debug mode, then does not check in release. I would find this more palatable than what you suggest, but note that we are considering switching this to always checking, with an explicit unchecked call alongside to lower safety in exchange for performance. The goal is to make a safe API that is the easiest to reach for, and also provide a less-safe API for when there is a performance need.

I don't think "low-level" implies performance, especially when talking about the Mach Port API. The Mach Port API is mostly about IPC, and IPC is rarely "performance sensitive", as you depend on the other processes' performance and OS scheduling anyway.

Defaulting to the unsafe initialiser is premature optimisation IMHO.

Mach IPC is involved in plenty of performance-sensitive codepaths.

Thinking more about it, yes, Mach messages are used in performance-sensitive paths (they are used so pervasively on Apple OSes that they must be used in performance-sensitive paths like event handling, XPC, video decoding and playback, …), but this is mostly message passing.

Mach port allocation should not be performed in hot code paths, so having Mach.Port.init be safer at the cost of a small performance hit should not be an issue, especially as there is an unsafe variant if you want to trade off safety for performance.

This behaviour would better match other Swift APIs, which are safe by default but offer unsafe variants for cases where checks are considered too costly.

I find this discussion confusing - is this actual undefined behaviour, as in UB? Or is it merely unspecified behaviour?

I would have assumed the latter, but the terminology has now switched to "safety" (which in Swift, means UB), and now we're drawing analogies to bounds-checking, which exists to prevent UB.

That’s a good question. I took

…to mean undefined behavior.

Using a Mach API with the wrong right is perfectly defined at the Mach API level, and just results in the call returning something like KERN_INVALID_NAME, KERN_INVALID_RIGHT, or another related error code.

The fact that it is considered undefined probably just means the Swift API makes strong assumptions about port right correctness and will misbehave if those assumptions are broken.

I think a lot of us have familiarity with the specific way “undefined behavior” is used in the C spec. The C spec says Undefined Behavior can’t happen in a well-formed program, and modern optimizers like to rely heavily on this fact to do things like eliminate entire swaths of code because there’s a chance an addition somewhere within might overflow.