"swiftified" Darwin / POSIX APIs

The recent errno thread gave me this idea:

Would there be interest in having a "swiftified" version of the Darwin / POSIX APIs? Among the features we could have:

  1. errno -> throw conversion.
    • no fear of ARC messing with errno.
    • error handling standardized via throws, so callers don't have to deal with the differing conventions of particular APIs (in Darwin APIs an error condition can be signalled in many different ways depending on the API: NULL, -1, < 0, <= 0).
  2. normal and convenient String / Data support for function parameters, including inout parameters, return values, string and data pointers stored in structures, etc. The resulting API is much safer and more convenient with respect to:
    • memory allocations and ownership
    • memory leak prevention
    • buffer overflow prevention
  3. array support where convenient, e.g. some APIs could be restructured to return arrays instead of linked lists: getifaddrs could return an array, and there would be no need to call freeifaddrs on that array.
  4. Int is used in place of Int32, as the more convenient Swift type.
  5. type safe constants (instead of PF_INET, SOCK_DGRAM, etc) – would make the API much less error prone.
  6. better, swifty naming for those constants (.inet, .datagram, etc).
  7. flags are passed as option sets.
    • as in the fopen example presented, mode parameters like "r", "w+", etc are also represented as type safe option sets.
  8. potentially we could prettify the naming (of "creat", mkfifo, etc)
    • this might include external labels for function parameters where deemed needed.
  9. We would have proper documentation for those "wrapper" APIs instead of relying on "man" pages.
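To make points 5 to 7 a bit more concrete, here is a minimal sketch of how raw open(2) flags could be modelled with an enum and an option set. The names AccessMode, OpenOptions and rawOpenFlags are made up for illustration; they are not an existing Darwin or swift-system API, and a real design would need to cover many more flags.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical type-safe replacement for the O_RDONLY / O_WRONLY / O_RDWR constants.
/// Exactly one access mode applies to a file, so an enum fits better than an option set.
enum AccessMode {
    case readOnly, writeOnly, readWrite

    var rawValue: Int32 {
        switch self {
        case .readOnly: return O_RDONLY
        case .writeOnly: return O_WRONLY
        case .readWrite: return O_RDWR
        }
    }
}

/// Hypothetical option set for flags that can be freely combined.
struct OpenOptions: OptionSet {
    let rawValue: Int32

    static let create = OpenOptions(rawValue: O_CREAT)
    static let truncate = OpenOptions(rawValue: O_TRUNC)
    static let append = OpenOptions(rawValue: O_APPEND)
}

/// Combines the typed values into the raw flags argument that open(2) expects.
func rawOpenFlags(_ mode: AccessMode, _ options: OpenOptions = []) -> Int32 {
    mode.rawValue | options.rawValue
}
```

With this shape a call site would read rawOpenFlags(.writeOnly, [.create, .truncate]), and passing, say, a socket constant where an open flag is expected would no longer compile.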

Minimal usage example to illustrate some of the above bullet points:

    var resolved = ""
    let result: String = try realPath(path, &resolved)
    let file: File = try fileOpen(path, [.read, .binary])
    let s: Int = try socket(.inet, .stream)
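For illustration, here is roughly how the errno -> throw conversion behind a call like realPath could look. This is only a sketch under assumptions: PosixError, throwingOnMinusOne and this single-argument realPath are hypothetical names, and the code simply reads errno on the line right after the failing call. As the errno thread discusses, the language does not formally guarantee that nothing touches errno in between, so treat this as the intended shape rather than a proven-safe implementation.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical error type carrying the raw errno value.
struct PosixError: Error, Equatable {
    let code: Int32

    /// Human-readable description as reported by strerror(3).
    var message: String { String(cString: strerror(code)) }
}

/// Hypothetical helper for calls that signal failure by returning -1.
/// Converts the Int32 result to Int and turns the failure case into a thrown error.
func throwingOnMinusOne(_ body: () -> Int32) throws -> Int {
    let result = body()
    if result == -1 {
        // Assumption: nothing has overwritten errno since `body` returned.
        throw PosixError(code: errno)
    }
    return Int(result)
}

/// Hypothetical wrapper around realpath(3), which signals failure by returning NULL.
func realPath(_ path: String) throws -> String {
    // Passing nil asks realpath to allocate the result buffer itself.
    guard let resolved = realpath(path, nil) else {
        throw PosixError(code: errno)
    }
    defer { free(resolved) } // the malloc'ed buffer is ours to release
    return String(cString: resolved)
}
```

The caller then only ever sees try / catch, regardless of whether the underlying call reports failure through NULL or -1.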

the thread @tera has linked to is about writing extensions to the swift-system package, and being unable to do so due to difficulty safely accessing errno. this is something we only pursue in the first place precisely because swift-system does not support OS interactions that are more complex than basic file reading and writing.

i don’t know if all the APIs proposed here should belong in swift-system, but if they do, they should probably be added to the project roadmap which has not been updated in about 2.5 years.


Of course, the dozens of (incomplete) POSIX wrappers on GitHub alone tell the story. I'm guilty of one or two in my own projects as well. Alas, finding a canonical representation that makes sense is hard in places. I'd love to see that happening, but it would probably have to be a grassroots initiative – judging by the way swift-system has moved, I wouldn't wait for an Apple framework to reach critical mass.


That thread indicates that the reason swift-system does not expose errno is not because it is out of scope for the package - rather, that the language does not provide the tools necessary to expose it safely. Any other Swift library would have the same difficulty properly exposing errno.

As the roadmap you linked to indicates, the scope of the package is also not limited to basic file reading and writing. And the goals of the package seem to include everything that was asked for.

There are many reasons for swift-system currently being so limited. I suspect some of it, again, has to do with language limitations.


I'll give a concrete example: @tera mentioned getifaddrs. If you're not familiar with that call, here's the Linux man page. Basically it returns a linked list of the following elements:

struct ifaddrs {
    struct ifaddrs  *ifa_next;    /* Next item in list */
    char            *ifa_name;    /* Name of interface */
    unsigned int     ifa_flags;   /* Flags from SIOCGIFFLAGS */
    struct sockaddr *ifa_addr;    /* Address of interface */
    struct sockaddr *ifa_netmask; /* Netmask of interface */
    union {
        struct sockaddr *ifu_broadaddr;
                        /* Broadcast address of interface */
        struct sockaddr *ifu_dstaddr;
                        /* Point-to-point destination address */
    } ifa_ifu;
#define              ifa_broadaddr ifa_ifu.ifu_broadaddr
#define              ifa_dstaddr   ifa_ifu.ifu_dstaddr
    void            *ifa_data;    /* Address-specific data */
};

Notice all of these pointers - not only to the next item in the list, but also strings and other values. When you call freeifaddrs it not only frees the linked list, it frees all of that data as well, making these dangling pointers.

Doing what @tera suggested and eagerly copying the linked list into an array is therefore insufficient. We would also need to eagerly copy all of this other data (and copying the ifa_data member could be especially awkward). That's a lot of overhead for a low-level system library. Let's say I have a particular interface address and want to find the name of only that interface - the eager version would do a lot of unnecessary work.

Once you start eagerly copying everything, you introduce so many overheads that some people will inevitably get frustrated with it and drop down to using the unsafe C functions directly. It's not a strategy that scales well to all APIs and use-cases.

Alternatively, with better support for non-copyable types, we could pair the getifaddrs/freeifaddrs calls in a non-copyable iterator. The struct shown above could be a non-escaping type with a lifetime dependency on that iterator, with data exposed as more ergonomic types (e.g. exposing name as a String) via computed properties. I think this would be a much better approach - it adds safety and ergonomics to the C interface, without any overheads.

For instance, you'd be able to write something like this:

func getInterfaceName(_ addressToFind: Address) -> String? {
  for info in getifaddrs() /* would probably have a better name */ {
    if info.address == addressToFind {
      return info.interfaceName
    }
  }
  return nil
}

And it would be exactly as fast as if you'd used the unsafe C APIs directly.

(Just to be clear: I don't know if this is the design the swift-system maintainers would choose, but it's the kind of design I would want from it. It's safe, it's ergonomic, it's swifty, it's fast.)
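As a rough approximation of that idea with what the language offers today, here is a sketch of a non-copyable owner that pairs getifaddrs with freeifaddrs and walks the list in place. InterfaceAddressList and firstInterfaceName(where:) are hypothetical names. Note the limitation: the closure receives a raw ifaddrs value whose pointers are only valid during the call, and nothing here stops them from escaping, which is exactly the part that needs the lifetime-dependency support described above.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical non-copyable owner of the list returned by getifaddrs.
/// Because it cannot be copied, freeifaddrs runs exactly once, when the value dies.
struct InterfaceAddressList: ~Copyable {
    private let head: UnsafeMutablePointer<ifaddrs>?

    init() {
        var head: UnsafeMutablePointer<ifaddrs>?
        let status = getifaddrs(&head)
        precondition(status == 0, "getifaddrs failed")
        self.head = head
    }

    deinit {
        if let head { freeifaddrs(head) }
    }

    /// Walks the list in place and copies out only the name of the first match.
    /// No other strings or addresses are converted or copied.
    func firstInterfaceName(where matches: (ifaddrs) -> Bool) -> String? {
        var cursor = head
        while let entry = cursor {
            if matches(entry.pointee), let name = entry.pointee.ifa_name {
                return String(cString: name)
            }
            cursor = entry.pointee.ifa_next
        }
        return nil
    }
}

/// Usage: returns the name of whichever entry comes first, if there is one.
func firstListedInterfaceName() -> String? {
    let list = InterfaceAddressList()
    return list.firstInterfaceName { _ in true }
}
```

The only String that gets created is the one that is actually returned; everything else stays in the C list until the owner goes out of scope.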


I gave it a spin, and these are the timings I get with a naïve / eager approach:

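import Foundation // Date; on Apple platforms this also brings in Darwin (getifaddrs, freeifaddrs, sockaddr)

struct IfAddrs {
    // var ifa_next: UnsafeMutablePointer<ifaddrs>!
    var ifa_name: String
    var ifa_flags: UInt32
    // Note: copying a plain sockaddr keeps only its fixed-size part, so longer
    // address structures such as sockaddr_in6 are truncated by this approach.
    var ifa_addr: sockaddr?
    var ifa_netmask: sockaddr?
    var ifa_dstaddr: sockaddr?
    // var ifa_data: UnsafeMutableRawPointer! // TODO
}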

func getifaddrs() -> [IfAddrs] {
    var addrs: UnsafeMutablePointer<ifaddrs>?
    var start = Date()
    let err = getifaddrs(&addrs)
    print("getifaddrs time: \(Int(Date().timeIntervalSince(start)*1000_000)) µs") // 133 µs
    precondition(err == 0) // getifaddrs returns 0 on success, -1 on failure
    
    start = Date()
    var addr = addrs
    var addresses: [IfAddrs] = []
    
    while let current = addr {
        let p = current.pointee
        addresses.append(
            IfAddrs(
                ifa_name: String(cString: p.ifa_name),
                ifa_flags: p.ifa_flags,
                ifa_addr: p.ifa_addr?.pointee,
                ifa_netmask: p.ifa_netmask?.pointee,
                ifa_dstaddr: p.ifa_dstaddr?.pointee
            )
        )
        addr = p.ifa_next
    }
    print("count: \(addresses.count), enum time \(Int(Date().timeIntervalSince(start)*1000_000)) µs") // count: 52, 20 µs
    
    start = Date()
    freeifaddrs(addrs) // 0.0 µs
    print("freeifaddrs time \(Date().timeIntervalSince(start)*1000_000) µs")
    return addresses
}

_ = getifaddrs()
getifaddrs time: 133 µs
count: 52, enum time 18 µs
freeifaddrs time 0.0 µs

In this test, the enumeration time that converts interface names to Swift strings is dwarfed by the time of the Darwin.getifaddrs call itself, so while what you wrote above makes sense in theory, it probably won't matter in practice in this particular case.


Note that I am not converting ifa_data fields – I never knew how to do it properly... The man page is obviously bogus:

as there is no "struct ifa_data" defined in <net/if.h> or anywhere else.

Could you please expand on that? Is the method I outlined in the post not safe with respect to working with errno?

You're missing the point, which is that a lot of C APIs cannot be wrapped in safe abstractions in Swift (yet) without a lot of unwelcome copying.

Even in this case - some random function that you happened to mention, where there are a very small number of objects and very short strings (and presumably on a fairly high-end device), the copying overhead necessary to expose a safe interface is 12% of the overall time. I would certainly understand if the package authors wanted to defer adding the API until that kind of overhead could be avoided! (Not that I'm aware of any PR that was rejected for that reason)

We've only just gained the ability to express non-copyable types, and Apple's OS team immediately made use of it by introducing an API for mach ports. Of course, we always wish for more - you're welcome to submit PRs to swift-system, it'll be less work than making a new library which duplicates the effort.


Not necessarily. The approach my NetworkInterfaceInfo package takes is to just wrap each ifaddrs struct with a lightweight Swift struct, which is just a pointer to the raw ifaddrs struct and a reference-counted wrapper over the head pointer to keep the linked list alive as long as any part of it is still in use.

This works quite well in most real use cases (which are basically either keeping the whole list around, or enumerating over it immediately and pulling out the specific data of interest, e.g. all active, routable IPv4 addresses).

You want Swift abstractions over the raw fields anyway, because you want to use native Swift types (e.g. String) and also because, on Apple platforms, getifaddrs is full of bugs that you want the library to work around for you.

The things that are actually expensive are things like converting to Swift types, so what matters most is doing those only when strictly necessary. The lightweight wrapper means you can still do convenient things like store an actual array of interface infos (rather than being forced to use it only transiently, as with an iterator-only model) but it still costs you very little to iterate over them looking for something specific.
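To show the general shape of that approach (this is a simplified sketch written for this post, not the actual NetworkInterfaceInfo source; InterfaceListOwner, InterfaceEntry and interfaceEntries are illustrative names): a reference-counted owner frees the list when the last entry goes away, and each entry is just a pointer plus a reference to that owner, with conversions done lazily in computed properties.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Reference-counted owner of the list head; frees the list when the last entry is gone.
final class InterfaceListOwner {
    let head: UnsafeMutablePointer<ifaddrs>?

    init() {
        var head: UnsafeMutablePointer<ifaddrs>?
        let status = getifaddrs(&head)
        precondition(status == 0, "getifaddrs failed")
        self.head = head
    }

    deinit {
        if let head { freeifaddrs(head) }
    }
}

/// Lightweight view of one list element: a raw pointer plus a strong reference
/// that keeps the whole list alive for as long as this value exists.
struct InterfaceEntry {
    let owner: InterfaceListOwner
    let raw: UnsafeMutablePointer<ifaddrs>

    /// Converted on demand, so callers that never read the name never pay for it.
    var name: String { String(cString: raw.pointee.ifa_name) }
    var flags: UInt32 { raw.pointee.ifa_flags }
}

/// Builds an array of entries without copying any strings or addresses.
func interfaceEntries() -> [InterfaceEntry] {
    let owner = InterfaceListOwner()
    var entries: [InterfaceEntry] = []
    var cursor = owner.head
    while let raw = cursor {
        entries.append(InterfaceEntry(owner: owner, raw: raw))
        cursor = raw.pointee.ifa_next
    }
    return entries
}
```

Storing the resulting array is fine, since every element retains the owner; the cost per element is one pointer and one retain rather than a String and several sockaddr copies.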

I imagine this is true of many POSIX APIs, once you get beyond the high-profile ones like read.

I also think performance concerns are overblown for most of these APIs; they just don't need to be that fast or efficient. They're not used that often, or (as @tera suggests) there are unavoidable secondary costs tied to their use that swamp a few memcpys and the like. It's much more important to have an elegant, safe, native Swift API. If a small number of use-cases require bypassing that for what they think are valid performance reasons, so be it.
