<unicode/uregex.h> include magically turns into `import ICU`

Hi!

I'd like to import <unicode/uregex.h> in a Swift package. I have followed this guide on importing system libraries (SPM Docs).

The following setup works, but I don't understand why it does on macOS. I am importing a CUnicodeRegex module (generated via the modulemap, see below). Strangely, when I command-click said CUnicodeRegex module to view the interface, symbols from the shim header would show up, but no symbols from unicode/uregex.h. Instead, the interface lists just import ICU which seems to get inserted automagically. Why and where does this happen? Does the Swift compiler apply special treatment to ICU imports?


Setup:

.
├── Package.swift
└── Sources
    ├── CUnicodeRegex
    │   ├── r.h
    │   └── module.modulemap
    .
    .
module CUnicodeRegex [system] {
  header "r.h"
  link "icucore" ← or icui18n
  export *
}
// r.h

#define U_DISABLE_RENAMING 1
// `- otherwise, uregex_open is re-#define-ed on Linux, and macro 'types' are not supported in Swift

#include <unicode/uregex.h>
// Package.swift
...
.systemLibrary(name: "CUnicodeRegex")
...

Also in Package.swift:
Linux: -l or .linkedLibrary("icui18n")
macOS: -l or .linkedLibrary("icucore")

Haven't tried if this works under the normal compilation mode, I'm using Embedded Swift

The ICU module is defined in the macOS SDK's usr/include/unicode.modulemap (referenced by usr/include/module.modulemap), and includes all of the public headers in usr/include/unicode/. So if you're using the macOS SDK as your SDK root (or you're otherwise adding that directory to your header search paths), ClangImporter will treat an inclusion of <unicode/uregex.h> as an import of the corresponding module, not as a textual inclusion of the symbols in that header.

1 Like

Thanks :+1:

... this also explains why you can import, e.g., unistd_h and get symbols from unistd.h