Pitch: The CStdlib module

I would like to pitch a new module offered by Swift by default. The module would re-export the correct module that contains the POSIX or POSIX-like C standard library for the current platform, if any; it would not be imported by default, but would allow "reasonably cross-platform" code to avoid using lengthy #if canImport(…) chains to gain access to all possible stdlibs, given that they have different names on different OSs.

For example, the module may be named CStdlib. It would behave identically to a module whose entire source was:

#if canImport(Darwin)
@_exported import Darwin
#endif

#if canImport(Glibc)
@_exported import Glibc
#endif

#if canImport(CRT) // Win32
@_exported import CRT
#endif

#if canImport(WASILibc)
@_exported import WASILibc
#endif
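
For comparison, here is a rough sketch of what a client source file has to carry today, without such a module. Under the pitch, the whole import chain would collapse to a single import CStdlib. The helper name byteLength(of:) is made up for illustration, and strlen is assumed to be exported by each of the modules listed, since it is ordinary ISO C.

```swift
// Today's boilerplate: pick whichever C library module exists on this platform.
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(CRT)
import CRT
#elseif canImport(WASILibc)
import WASILibc
#endif

/// Hypothetical helper: length in bytes of the string's UTF-8 form,
/// computed by the platform's C strlen.
func byteLength(of text: String) -> Int {
    // Swift passes the String as a temporary NUL-terminated C string.
    return Int(strlen(text))
}

print(byteLength(of: "hello"))
```

Only the import chain at the top is platform-aware; the rest of the file is the same everywhere, which is the part the pitch aims to shorten.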

This is not meant to overlap with Swift System; it provides no idiomatic types or harmonization between API, but it would help implementors of Swift System, Foundation and similar libraries reduce code complexity and automatically respond to platform additions as those are upstreamed.

Code that has to deal with OS-specific differences, such as non-portable API, can still import the OS-specific modules by name; this module does not obviate those.

cc @Max_Desiatov.

So to clarify, this won't re-order arguments to handle potential collisions, nor will it offer any implementations or shims. Would this have an issue with Windows systems that expose their Linux layer?

Overall this would be a great quality of life improvement to importing things.

Correct.

WSL is not like MinGW/Cygwin, in the sense that POSIX API are not exposed to Windows processes; instead, a virtualized container directly executes Linux executables. Binaries built with Swift for Windows can execute outside of WSL and import the CRT module; binaries built with Swift for Linux can execute in the WSL environment and can import the specific distro's C stdlib as Glibc. While CF has some support for Cygwin, the official Swift for Windows does not build with it.

Nice. +1.

The only downside I can see is that it would look weird if you import CStdlib but use canImport(Darwin) to adjust your call-site for quirks of each implementation. For example, accessing the uint8_t members of an in6_addr:

import CStdlib

#if canImport(Darwin)
  let in6_addr_octets = \in6_addr.__u6_addr.__u6_addr8
#elseif canImport(Glibc)
  let in6_addr_octets = \in6_addr.__in6_u.__u6_addr8
#endif

In C, these sometimes get papered-over using macros that the clang importer doesn't support. For example, see the Darwin header:

typedef struct in6_addr {
	union {
		__uint8_t   __u6_addr8[16];
		__uint16_t  __u6_addr16[8];
		__uint32_t  __u6_addr32[4];
	} __u6_addr;                  
} in6_addr_t;

#define s6_addr   __u6_addr.__u6_addr8 /* <--- clang importer says no */
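
When only the raw bytes are needed, one possible workaround (a sketch, not something proposed in this thread) is to skip the member names entirely and copy the struct's storage with withUnsafeBytes(of:). This assumes in6_addr consists of exactly the 16 address octets, which holds for the Darwin header above; octets(of:) is a hypothetical helper name.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical helper: copies the address bytes out of an in6_addr
/// without naming the platform-specific union members.
func octets(of address: in6_addr) -> [UInt8] {
    return withUnsafeBytes(of: address) { Array($0) }
}

var loopback = in6_addr()
// inet_pton returns 1 on success; the result is ignored in this sketch.
_ = inet_pton(AF_INET6, "::1", &loopback)
print(octets(of: loopback))
```

This trades the per-platform key path for a dependence on the struct's memory layout, so it only helps for reading or copying the bytes, not for code that needs the typed members.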

The other variants are OS-specific; so you can use #if os(Windows) and #if canImport(CRT) interchangeably. For Darwin-family OSes, it's a pain in the neck to list all of the marketing names, so folks tend to prefer canImport(Darwin).

We could perhaps overcome this with an #if os(Darwin) or #if os(Apple), or perhaps if a compile-time define could be set on each branch.

My hope is that the usefulness of this module is there for libraries that 'know what they're doing' — that is, are writing bindings for that behavior and thus know to account for the weirdness of it, rather than as a general tool. I would regard Swift System and Foundation to be the libraries that add idiomatic types, and the ones that I would want people to import rather than CStdlib specifically for this use.

Initially I thought I'd stay neutral, as one argument against it is that all these libraries are sufficiently different from each other. Having an explicitly different import is a good reminder to the user that "here be dragons".

OTOH, the precedent of having different API for different platforms but the same import was already established by frameworks like SwiftUI. And I myself have written a few modules like this: for example, our TokamakShim module re-exports SwiftUI on Apple platforms and a platform-appropriate renderer elsewhere, and our CombineShim conditionally re-exports Combine or OpenCombine with the same logic.
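
The shim pattern itself is small. A minimal sketch of what such a file might look like, assuming it follows the same shape as the CStdlib example (this is not the actual CombineShim source, and the backendName constant is purely illustrative):

```swift
// Re-export whichever reactive framework is available, so clients
// can write a single import of the shim module on every platform.
#if canImport(Combine)
@_exported import Combine
let backendName = "Combine"
#elseif canImport(OpenCombine)
@_exported import OpenCombine
let backendName = "OpenCombine"
#else
// Neither module is available; the shim then exports nothing.
let backendName = "none"
#endif

print("re-exporting: \(backendName)")
```

As with the pitched module, the shim only forwards declarations; any API differences between the two backends remain visible to clients.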

So I understand the need for it. And I agree with the point that users importing CStdlib already should know why, how, and what exactly they're doing. Overall it's a +1 from me.

Yes, it isn't a deal-breaker. But it means the utility is truly limited to removing long canImport chains and automatically attempting to compile on new platforms. The code will still look a bit weird as you use canImport to check module availability, without actually importing that module directly.

Still, I think it has value. I would use it, for sure.

Would this be something built into the Swift toolchain?
Or could it be part of the Apple package collection?
Perhaps as a new module of the apple/swift-system package?

Should it exclude platform-specific submodules (such as hfs and sys.qos)?

#if canImport(Darwin)
@_exported import Darwin.C
@_exported import Darwin.POSIX
#endif

Otherwise, another module name may be more appropriate than CStdlib.

Since the hope is that this be used in the implementation of both Swift System and similar low-level packages, and also of packages that are already part of the Swift build like Foundation, it would be built into the distributable toolchain and built out of the Swift repository.

This is interesting; I think you have a good idea here, which would solve some of the weirdness above. I'm not sure if all modulemaps we ship correctly segregate submodules in this way, but I would be happy with this suggestion for much the same reason you propose it.

I've done the Darwin/Glibc dance quite a few times, and this would definitely have come in handy, so +1 to the idea from me :ok_hand:

To put a little paint on the shed, maybe going for something like SystemLibC or PlatformCLib or similar could help make it more obvious that things might be different on different platforms?

It'd be great to formalize and standardize this. +1!

I think this would be an immediate improvement to Swift for anyone using POSIX APIs. Ifdeffing, or maintaining this in some private repository all the time, is annoying. Shipping it with Swift would have the added benefit that adding support for a new platform would not break existing packages that simply want POSIX. +1 from me

I'm all for this. +1

Bringing this pitch back up because, as we support more and more platforms, the need for a unified CStdlib is getting stronger and stronger.

Implementation is the major obstacle to getting this into Swift formally, because different Libcs provide slightly different interfaces. Here are my thoughts:

  • To eliminate platform differences, we should not export everything from the imported Libc as originally proposed. Instead, it’s better to break it into several modules (identified by specific common headers), and only import and export the common (standard) modules on import CStdlib;
  • CStdlib should support submodules. We can hide uncommonly used modules from import CStdlib and force users to import them explicitly. Availability of explicit-only submodules may differ between platforms, and the compiler should emit a warning if such an import is not wrapped in an #if canImport(…) check.

I would also like to mention some relevant developments in Swift:

  • Some members have put long and dedicated effort into bringing Swift to Musl platforms (mainly Alpine), which will definitely break the following convention:
#if os(Linux)
import Glibc
#endif
  • We currently conflate Bionic Libc with Glibc under the Glibc name. I cannot remember how this happened, but it should be corrected because they are indeed different.
  • A platform (e.g. WASI) may lack capabilities that other platforms have, but developers get no compiler diagnostics when they use unsupported APIs, because those uses are hidden behind #if checks. By knowing submodule availability on all platforms, the compiler can diagnose such usability problems in Libcs.
  • There are long-running debates on whether to use #if os(…) or #if canImport(…) to import C. Neither is overwhelmingly superior to the other. It’s time to end the debate with a whole new solution.

Given these facts, I believe now is the time to settle this pitch and clean up the verbose #ifs before new platforms formally land.
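
To make the Musl point concrete, here is a sketch of how the os(Linux) convention could be rewritten with canImport(…) so that it does not assume a particular libc. The module names Musl and Bionic are assumptions about how those libcs would be exposed; canImport(…) simply evaluates to false for a module that does not exist, so the chain still compiles where they are absent. parsePort is a hypothetical helper, and the chain only covers Unix-like platforms.

```swift
// Ask which libc module exists instead of assuming "Linux means Glibc".
#if canImport(Glibc)
import Glibc
#elseif canImport(Musl)      // assumed module name for musl-based systems
import Musl
#elseif canImport(Bionic)    // assumed module name for Android's libc
import Bionic
#elseif canImport(Darwin)
import Darwin
#endif

/// Hypothetical helper that only relies on ISO C (atoi from <stdlib.h>),
/// so it does not care which of the modules above was imported.
func parsePort(_ text: String) -> Int32 {
    return atoi(text)
}

print(parsePort("8080"))
```

Every new libc still adds a branch to a chain like this in every file that uses it, which is the maintenance cost a shared module would absorb.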

I would like to solicit feedback from the community here:

  • If there were a single C platform library module, and it provided different API surface on different platforms (which I'm pretty sure it has to, given how different the C stdlibs are), would that be acceptable to you? How important is it to subset this API or make it consistent, assuming that this is not the API abstraction or platform abstraction library (tasks better left to System and Foundation)?

If that is acceptable, I've been thinking that CPlatform or similar (bikeshedding aside) could be a better name than CStdlib, to make clear that there's no specific 'standard' we're referring to.

  • Is there a way that we should allow people to query for specific API, or should we recommend they use existing tools to do so?

For example, I've pushed to be more consistent in swift-corelibs-foundation about using canImport(…) wherever possible, partly because the os(…) check for 'Apple OSes' (aka: Darwin plus the closed-source stuff) is already 1. long and 2. not likely to become any shorter in the long term:

os(macOS) || os(iOS) || os(tvOS) || os(watchOS)

I've been pushing for #if canImport(Darwin) and #if canImport(ObjectiveC) depending on intent and context, but these are imprecise: Darwin's C stdlib module would, for example, be present on a non-Apple distribution of Darwin itself, though I'm not sure how relevant that is.
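
The gap between the two checks can be made visible with a small sketch. The constant names are illustrative only. On any OS in the os(…) list the Darwin module is importable, but the reverse need not hold, which is the imprecision described above.

```swift
// True only for the operating systems spelled out in the list.
#if os(macOS) || os(iOS) || os(tvOS) || os(watchOS)
let isListedAppleOS = true
#else
let isListedAppleOS = false
#endif

// True wherever a module named Darwin can be imported, listed OS or not.
#if canImport(Darwin)
let hasDarwinModule = true
#else
let hasDarwinModule = false
#endif

print("listed Apple OS: \(isListedAppleOS), Darwin importable: \(hasDarwinModule)")
```

Code guarded by the first condition silently stops compiling on a Darwin-based OS that is missing from the list, while the second condition keeps working without edits.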

In general, the convention of #if os(…) followed by an import is alarming because it makes portability a moderate nightmare. Little bits keep popping up where we use, for example, TARGET_OS_LINUX in Core Foundation where that same code would apply equally well to the BSDs or other OSes. On the other hand, we had some discussion (I think @compnerd might remember?) about the naming of Glibc specifically, and how, because of those differences, the name has become API surface that we would not gain much by breaking for either Linuxen or the BSDs.

I'd like to hear more thoughts on this if any interested party has some.

It has been a while so I don't remember the full details. But I believe that is what @stevapple is referring to. Glibc is not really glibc; it is used across all the platforms (other than Windows), and each one has a different interface (and often subtle differences in the signatures). More problematically, this makes supporting alternative libcs even more challenging (e.g. musl intentionally changes the structure and interface).

As to the naming, I think that CC is better than CPlatform. Ultimately, Darwin is the only platform library there is. The equivalent on Windows would be WinSDK.

On Linux, things are a bit more murky - there is no good definition of a platform; it is a collection of packages that someone (the distributor) decides on, and so it depends on the distributor. The BSDs have this slightly better in that there are a handful of them, and each one is a platform in its own right (e.g., FreeBSD vs OpenBSD). I understand the argument that, technically, that collection is itself the definition of the platform, but it simply shuffles the problem around - now you need to determine what other pieces you need to import per-distribution on Linux.

Additionally, I am not convinced that a libc alone should be considered the platform. If that is the case, perhaps Darwin should actually become a shell for @_exported import Darwin.C.

For clarity, WASI also uses a different C stdlib name (WASILibc).

I’m for keeping CStdlib, but only importing the real standard library (according to Wikipedia, 29 headers so far). If users want to use any API from a specific libc, they would need to import it explicitly.

Does anyone know what the differences in implementation of the stdlib are amongst platforms?