There has been some discussion in the comments of this PR for adding FreeBSD support about the fact that it is presently abusing the Glibc module (which really should just be for systems that use Glibc, of which FreeBSD is not one).
During this discussion, the idea was again raised of having some kind of unified C library module so that users don't have to do
#if os(Linux)
import Glibc
#elseif os(macOS) || os(iOS) || ...
import Darwin
#elseif os(Windows)
import CRT
#else
#error("We need a C library!")
#endif
or similar in any file that needs the C library.
There are already two pitches on this topic. I don't want to hijack the FreeBSD PR, and I don't want this discussion to be a pitch either, because that would, I think, imply that someone might have time to work on it some time soon, which is not a given. I do, however, want to talk about the current state of affairs and what we should do about this problem, hence this thread.
I think the people pitching the idea of having some portable C standard library module are right to argue for that. Right now, even after doing the dance above, it's still hard to use C library functions because types are imported differently. For example, FILE * can be imported as OpaquePointer, or as UnsafeMutablePointer<FILE>, or indeed as UnsafeMutablePointer<FILE>?, and exactly which you get depends on precisely how the C library in question defined the type. This makes it difficult to portably use C library functions that, on the C side, are perfectly portable.
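One workaround people use today is to hide the difference behind a typealias chosen at compile time. The following is only a minimal sketch: the names CFilePointer and writeText are made up for illustration, and the split between platforms (Musl and Android importing FILE * as OpaquePointer, the others as UnsafeMutablePointer<FILE>) is my understanding of current toolchains rather than something guaranteed for every C library version.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#elseif canImport(Android)
import Android
#elseif os(Windows)
import CRT
#endif

// Assumption: these are the platforms where FILE is an incomplete C type
// and therefore shows up in Swift as OpaquePointer.
#if os(Android) || canImport(Musl)
typealias CFilePointer = OpaquePointer
#else
typealias CFilePointer = UnsafeMutablePointer<FILE>
#endif

/// Hypothetical helper: writes `text` to the file at `path`.
/// Returns true only if the open, the write, and the close all succeeded.
func writeText(_ text: String, toPath path: String) -> Bool {
    // The explicit annotation is the point: this is the one place where the
    // platform-specific pointer type has to be named.
    guard let file: CFilePointer = fopen(path, "w") else { return false }
    let wrote = fputs(text, file) >= 0   // fputs returns EOF (negative) on error
    let closed = fclose(file) == 0
    return wrote && closed
}
```

This keeps call sites clean, but every project ends up maintaining its own copy of that #if, which is part of why a shared module is attractive.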
The reason for this thread is that although I like the idea of having a portable C standard library module (or modules, actually), it is sadly a lot harder to do that than it looks.
In particular, the way Clang modules work, each C type is defined by exactly one module. Thus if the C library headers declare FILE, then the module in the C library that defines FILE must be the only source of a type called FILE. If another FILE type is declared elsewhere and some code tries to use it, that will result in compiler errors (e.g. complaining that FILE was expected to be in Darwin.stdio_h but was found instead in Libc or vice-versa). We could in principle define a Swift interface to the C library that contains Swift definitions of all of the relevant functions and types, but then we have another issue, namely what you do if you are trying to interact with some third-party C code that hands you e.g. a FILE *, which will be whatever the underlying C library defined it as and not necessarily whatever the handwritten Swift bindings have settled upon.
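To make the third-party problem a little more concrete, here is a rough sketch of what a handwritten Swift binding might have to do in order to accept a FILE * regardless of how it was imported. CFileHandle is a hypothetical type invented for this example, not an existing API, and the sketch deliberately avoids touching the C library at all.

```swift
/// Hypothetical type-erased handle for a C `FILE *`.
struct CFileHandle {
    /// Address of the underlying C object; Swift never inspects its layout.
    let rawValue: UnsafeMutableRawPointer

    /// For C libraries where FILE is an incomplete type (imported as OpaquePointer).
    init(_ pointer: OpaquePointer) {
        rawValue = UnsafeMutableRawPointer(pointer)
    }

    /// For C libraries where FILE is a complete struct
    /// (imported as UnsafeMutablePointer<FILE>).
    init<Pointee>(_ pointer: UnsafeMutablePointer<Pointee>) {
        rawValue = UnsafeMutableRawPointer(pointer)
    }
}
```

Note that this only covers storing the pointer on the Swift side. To actually call fread or fclose, the handle still has to be converted back to whatever pointer type the platform's module imported, so the per-platform difference has moved rather than gone away.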
There is a related problem, namely that the Glibc headers in particular are a long way from being properly modularizable, with the result that the Glibc module is actually quite broken, and this is hard to fix. For the Static SDK for Linux, we actually process the Musl headers slightly in order to make them modularisation-friendly, and it seems not unlikely that we will need to adopt a similar approach for Glibc, though that is also tricky because with the Static SDK we actually ship the headers, whereas for Glibc we need to start with whatever is on the system we're building on. (There is precedent for this approach: for years GCC did precisely this when installing on various proprietary UNIX platforms, running a header-fixing script and generating its own copy of the system headers that took precedence in the include path over the system installed ones.)
I would very much like, personally, to have the ability to do things like
import C99 // Or C11, or C23
to get a standardised interface for the functions and types declared in the relevant C standard. I would also like to see something like
import CExtensions
that gets you the set of common C extensions that are available basically everywhere (e.g. strdup, strcasecmp), but with any naming differences erased (so, e.g. strcasecmp would work on Windows, even though the function there is called _stricmp); there is quite a sizeable set of these that work essentially identically across the various UNIX platforms and indeed also on DOS/Windows.
A similar strategy could be used for e.g. the POSIX API, though that one is rather larger and might need to be broken up into submodules for performance reasons.
But we need to think carefully about how we would go about implementing this. It is, sadly, very much not as simple as slapping together some new headers and writing a quick module map.
Addendum: There was a suggestion in the PR comments that this is something the Platform Steering Group should take on. PSG is clearly interested in it, but does not, on its own, work directly on things like this — for that, we would need to start a Working Group which would consist of various interested parties, in this case very likely including some PSG members but also clearly members of the wider community.