I would be very happy to define `CStdlib` as importing (transitively) as many of those exact headers as the C preprocessor can find on the platform. It's important to note that at least some of those headers, despite being part of the standard, are not available everywhere (`<uchar.h>` is missing on Darwin, for example), hence the 'as many as possible'.
I think enumerating the differences would take some effort. But as examples of them: while the standard mandates the organization of the declarations, it does not mandate inclusions, so modules can import other modules in different ways. Additionally, the same headers may provide extensions (e.g., the safe functions that Microsoft provides). Many of the annexes of the C standard are optional, as are many of the requirements of the standard itself.
Another interesting example is the implementation itself. musl, as a matter of principle, defines `stdin`, `stdout`, and `stderr` as macros whose expansion causes failures if the names are used as identifiers. The reasoning is that, according to the standard, once `stdio.h` is included those macro names are reserved identifiers, even though they are in the user's namespace.
As to more concrete differences, there are multiple language standards, and libraries can partially support a standard. On Windows, VLAs are not supported (outside of x86 for historical reasons). Deprecated functions may also be removed more aggressively (e.g. `gets`). I know that `aligned_alloc` from the C standard is not available on Windows, but there is an extension that supports the aligned allocation behaviour. Defect reports (DRs) can cause partial implementation of interfaces as well.
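The `aligned_alloc` gap is the kind of difference an import site ends up papering over. A minimal sketch, assuming the `CRT` module exposes `_aligned_malloc` and `_aligned_free` on Windows and that the other libc modules expose `aligned_alloc` for the deployment target in use; `allocateAligned` and `freeAligned` are hypothetical helper names:

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#elseif canImport(CRT)
import CRT
#endif

/// Hypothetical helper: allocate `size` bytes aligned to `alignment`.
/// `alignment` should be a power of two and `size` a multiple of it.
func allocateAligned(size: Int, alignment: Int) -> UnsafeMutableRawPointer? {
#if os(Windows)
    // Assumption: the Microsoft extension is visible through the CRT module.
    // Note the argument order is (size, alignment).
    return _aligned_malloc(size, alignment)
#else
    // C11 spelling; the argument order is (alignment, size).
    return aligned_alloc(alignment, size)
#endif
}

/// Hypothetical helper: release memory obtained from `allocateAligned`.
func freeAligned(_ pointer: UnsafeMutableRawPointer?) {
#if os(Windows)
    // Memory from _aligned_malloc must not be passed to free().
    _aligned_free(pointer)
#else
    free(pointer)
#endif
}
```

The two branches differ in more than the name: the argument order is swapped and the Windows variant needs its own deallocator, which is why a plain re-export of platform headers cannot hide the difference.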
However, I would agree that having `CStdlib` only include the submodules from the C standard is a good goal to aim for, though I am concerned that it may accidentally import something else due to dependencies in the C library headers.
I think that sort of gives us a way out: by defining it in terms of importing specific headers from the platform, we can give a pretty precise idea of the semantics of this module rather than nebulously referring to a standard, which justifies the varying interface.
Two problems I foresee:
- Someone uses API surface in a leaf module that is imported by `CStdlib` on only one or a few OSes. Any competent multiplatform CI for the leaf module will catch that.
- Someone uses API surface that is missing on only some OSes. Is it fair to ask people to use `canImport(…)` or `os(…)` (which one?) for this?

I don't have ready examples of these, though.
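For the second case, a guard might look like the sketch below. It uses `getprogname()`, which Darwin's libc provides but which is not part of the C standard, as a stand-in for "API that only some platforms have". `programName()` is a hypothetical helper, and the fallback branch is only an assumption about what a portable caller might do instead.

```swift
#if canImport(Darwin)
import Darwin
#endif

/// Hypothetical helper: the short name of the running program.
func programName() -> String {
#if canImport(Darwin)
    // Guarding on the module: this branch is compiled only where the
    // Darwin module (and therefore getprogname) can be imported.
    return String(cString: getprogname())
#else
    // Fallback that needs no libc at all: take the last path component
    // of the first command-line argument.
    let path = CommandLine.arguments.first ?? ""
    let isSeparator: (Character) -> Bool = { $0 == "/" || $0 == "\\" }
    return String(path.split(whereSeparator: isSeparator).last ?? "")
#endif
}
```

`canImport` asks whether a module is available, while `os` asks which operating system is being targeted. The two diverge exactly where the thread has trouble: `os(Linux)` is true for both Glibc and Musl targets, so only `canImport` can tell those apart.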
I took a look at the system modules (Darwin, WinSDK & visualc & ucrt, Glibc), and indeed feel that there is a lot of chaos, and that fixing it may be unavoidably breaking.
Clear directions (from my perspective):
- Group `Glibc` headers into modules like `Darwin` and `WinSDK` already do;
- Replace `ucrt` and `visualc` with `WinSDK.C`. Unstandardized parts fall into `WinSDK`.
Future directions:
- Given that Bionic only governs Android, maybe we should directly use `NDK` for the main namespace (e.g. `NDK.C` and `NDK.POSIX`);
- Add a `#if platform()` check for the dominant ABI instead of `#if canImport()`. The possible names are: BSD, Darwin, GNU, Musl, Android, MSVC, WASI (and Haiku?). This is mainly for OSes like Windows and Linux, which have variants with other ABIs. That's what I came up with to solve the `Glibc`/`Musl` imports and the check for Apple platforms.
I would specifically like to reach out to @compnerd for your opinion on restructuring how we import system libraries on Windows.
To make a unified `WinSDK` possible, we need to maintain symbolic links into `%UniversalCRTSdkDir%\Include\%UCRTVersion%` and `%VCToolsInstallDir%\include`, where we place `module.modulemap` alongside. I believe that's natural and reasonable for a pluggable `Windows.sdk`, and is possible to achieve.
My two bits: before we ship an official CStdlib module, I think we'd need to audit and Swiftify the APIs it contains.
I agree on figuring out what surface we are exposing. I disagree on "swiftifying": this is the scope and domain of System and marginally of Foundation, and it is better left to those two libraries. (This is, in part, for ease of implementation in specifically Foundation.)
I don't mean a full revamp as if libc were a native Swift module. I do think that certain C functions and interfaces are very "C" and there are sharp edges we can improve. For example, `bsearch()` could be overloaded in Swift such that you can pass it a Swift function/closure instead of needing to pass a `@convention(c)` function and smuggle any context through a pointer.
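A minimal sketch of what such an overload could look like, built on the real `bsearch()`. Since `bsearch()` has no separate context parameter, the closure is smuggled through the `key` pointer. `binarySearch`, `Probe`, and `trampoline` are hypothetical names, and the sketch assumes the comparator imports with optional pointer parameters, as it does on Darwin and Glibc (a libc with nullability annotations may import it differently).

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#elseif canImport(CRT)
import CRT
#endif

/// Type-erased comparison: given a pointer to an array element, returns a
/// negative, zero, or positive value, like a C comparator for (key, element).
private typealias Probe = (UnsafeRawPointer) -> Int32

/// Context-free function usable as a C function pointer. bsearch passes its
/// `key` argument first; here that key is a pointer to a `Probe` closure.
private func trampoline(_ key: UnsafeRawPointer?, _ element: UnsafeRawPointer?) -> Int32 {
    key!.assumingMemoryBound(to: Probe.self).pointee(element!)
}

/// Hypothetical closure-taking wrapper around bsearch().
/// `order` returns <0 if the sought value sorts before the element,
/// >0 if it sorts after, and 0 on a match. Returns the matching index.
func binarySearch<T>(_ sorted: [T], order: (T) -> Int32) -> Int? {
    return withoutActuallyEscaping(order) { order -> Int? in
        // Erase T so the trampoline does not need to be generic.
        let probe: Probe = { order($0.assumingMemoryBound(to: T.self).pointee) }
        return sorted.withUnsafeBufferPointer { buffer -> Int? in
            guard let base = buffer.baseAddress else { return nil }
            return withUnsafePointer(to: probe) { key -> Int? in
                guard let hit = bsearch(key, base, buffer.count,
                                        MemoryLayout<T>.stride, trampoline) else {
                    return nil
                }
                // Convert the element pointer back into an array index.
                let offset = UnsafeRawPointer(base).distance(to: UnsafeRawPointer(hit))
                return offset / MemoryLayout<T>.stride
            }
        }
    }
}
```

A caller then writes `binarySearch(values) { element in ... }` with an ordinary capturing closure. The amount of pointer plumbing needed to get there is arguably the point of the suggestion.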
Or, C has an `errno_t` type that is a typedef in C but could be made a proper struct in Swift and which could conform to `Error`, allowing throwing `errno` values without needing to import Foundation (to get `POSIXError`). Or maybe we could migrate `POSIXError` to this module and map `errno_t` directly to it.
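As a rough illustration of the struct idea, the sketch below wraps an `errno` value in a type that conforms to `Error`. The `Errno` type and `checkReadable` function are hypothetical, not existing API of any libc module, and the sketch assumes the platform module exposes `errno` as a Swift global, as the Darwin and Glibc overlays do.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#elseif canImport(CRT)
import CRT
#endif

/// Hypothetical Swift-side representation of a C error number.
struct Errno: Error, Equatable {
    let rawValue: Int32

    /// Snapshot of the calling thread's current `errno`.
    static var current: Errno { Errno(rawValue: errno) }
}

/// Hypothetical example of throwing an errno value without Foundation:
/// succeeds if `path` can be opened for reading, otherwise throws.
func checkReadable(_ path: String) throws {
    guard let file = fopen(path, "r") else {
        // Capture errno immediately, before any other call can change it.
        throw Errno.current
    }
    fclose(file)
}
```

Callers can then pattern-match in a `catch` clause, for example `catch let error as Errno where error.rawValue == ENOENT`.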
If we had all the time in the world…
- I'd be interested in making the calling convention for C functions more consistent: some functions use `errno` to report errors, others use their return values. Some return `0` on success, non-zero on failure. Others return specific magic values. It'd be worth exploring if we could expose these functions consistently (and maybe have errorful functions become throwing).
- There's value in replacing some of C's magic numbers with `Optional`. For example, `fread()` signals failure only through a short item count that the caller must disambiguate with `ferror()`; it could instead be imported as `func fread(...) throws -> Int`.
- I'd want to think about C strings in API: can we import functions that take C strings as if they take Swift strings? But that's a fairly large surface area and worth its own discussion.
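To make the throwing-import idea concrete, here is a hand-written sketch of the shape such an API could take for `fread()`. `readBytes` and `ReadFailure` are hypothetical names. The sketch also assumes that `errno` is meaningful after a failed `fread()`, which POSIX specifies but ISO C does not require.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#elseif canImport(CRT)
import CRT
#endif

/// Hypothetical error carrying the errno observed at the point of failure.
struct ReadFailure: Error {
    let errnoValue: Int32
}

/// Hypothetical throwing wrapper: reads up to `count` bytes from `path`.
/// End-of-file yields a shorter array; an I/O error throws.
func readBytes(atPath path: String, upTo count: Int) throws -> [UInt8] {
    guard let file = fopen(path, "rb") else {
        throw ReadFailure(errnoValue: errno)
    }
    defer { fclose(file) }

    var buffer = [UInt8](repeating: 0, count: count)
    let itemsRead = fread(&buffer, 1, count, file)

    // A short count means either end-of-file or an error;
    // ferror() is what tells the two apart.
    if itemsRead < count && ferror(file) != 0 {
        throw ReadFailure(errnoValue: errno)
    }
    return Array(buffer.prefix(itemsRead))
}
```

The C convention (short count plus a separate `ferror()` query) collapses into a single `throws -> [UInt8]` signature, which is the consistency the first bullet asks for.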
Some of what you propose already occurs: for example, we import (or define) the errno error type as `POSIXError`, and it does in fact conform to `Error`.
I think your requests are basically re-describing the mission statement for System, though? I really don't want to propose either a duplication of their work or a requirement for this library to be anything but a smart way for lower-level modules to obtain access to platform API, so that what you describe can be implemented elsewhere.
Fair enough. As I said, it's my two bits.
I want to confirm the goal of this feature. As far as I understand, the goal is to provide a standard way to import target-system-specific libc declarations, not to provide a cross-platform abstraction layer, right?
Assuming the above, I agree that we need this sort of shim module for interacting with low-level APIs. For those without knowledge of the various platforms, it has not been clear which module is the libc-like module on each platform.
I think we can treat importing CStdlib as the counterpart of #include, and delegate responsibility for portability to the import site and to libc itself, as C does. So it feels acceptable to me to provide different APIs on different platforms.
For the sub-modularization of CStdlib, I think we don't need to have our own grouping rule because we are not aiming to provide an abstraction layer. We can just use the header names as submodule names.
FWIW there was a similar discussion in the Rust community about the libc crate: https://github.com/rust-lang/rfcs/blob/master/text/1291-promote-libc.md
You're right. We're not talking about any abstraction layer here. `CStdlib` is just a shortcut to import headers from the C standard library. That is, `import CStdlib.stdio` should work like `#include <stdio.h>`.
However, there are obviously mixed goals across this long-running discussion thread. The seemingly simple goal of `CStdlib` is blocked by plenty of others. I would propose a possible step-by-step path to demonstrate this.
Standardize and modularize Glibc
Currently, `Glibc` is imported through `SwiftGlibc.h`, which only provides limited access to Glibc functionality. Also, in this way we cannot modularize `Glibc` as we did for system libraries on Darwin and Windows. `SwiftGlibc` is intended to be replaced by a standardized `Glibc` module map, but the work has been delayed for years.
Re-organize system libraries to a Darwin-like module
We have been special-casing `Glibc` and making this name an umbrella for any libc other than those of Darwin and Windows. However, that does not match reality. Different libcs provide different sets of functionality, and we should allow consuming all possible APIs from the system library.
Also, the way we load `Glibc` today is not standardized, while the Windows practice has already provided a standardized way to do this. On how to organize the libc module (maybe better expressed as the system SDK module), I suggest referring to `Darwin.modulemap`, since that's really neat.
Re-export `<SystemSDK>.C` as `CStdlib`
After these two preparatory steps, we can finally provide `CStdlib` on supported platforms. The `CStdlib` module should be exactly the same as the `C` submodule from the system SDK. However, clang doesn't allow submodules to be re-exported. If this restriction cannot be removed, we can implement the convention inside the Swift compiler, before it reaches clang to resolve the module.
I want to especially point out that this step is vital yet very breaking. It will eventually change the system SDK module name for nearly every target that's not using `Glibc`. To be frank, this is the major concern that is driving me to push and land the work before Swift 6.0.
I think it's time we got this over the finish line, given that the boilerplate libc import block keeps getting larger. I would like to split off Bionic soon, but I don't want to add a similar `import Bionic` at the top of every file, preferring to consolidate everything to `import CStdlib` first.
What do we have to do to get this finalized?