Hello, Swift Community!
Pitch
I'd like to pitch an improvement to the safety of converting string literals into UnsafePointer / UnsafeRawPointer types for interoperability with C APIs and C data types. Concretely, the proposal is to make UnsafePointer and UnsafeRawPointer conform to ExpressibleByStringLiteral. This can improve safety and can also help with constant folding of string literals, for example when they are used to initialize global variables that have C types.
Motivation
See the following code sample:
// C
struct MyStruct1 {
const char *name;
};
// Swift
let x = MyStruct1(name: "abc") // <== (!) warning: cannot pass 'String' to parameter; argument 'name' must be a pointer that outlives the call to 'init(name:)'
print(String(cString: x.name))
The warning points out a serious problem: the string literal is converted to a temporary String object, which is then converted to a pointer via .utf8CString, and there is no guarantee that the resulting pointer points to the global constant string. In practice, under -Onone the code above ends up producing a dangling pointer. The same problem shows up with C APIs that stash constant pointers (where the C side expects them to point to global constants):
// C
const char *current_phase = 0;
void set_current_phase(const char *s) { current_phase = s; }
void print_current_phase(void) { printf("%s\n", current_phase); }
// Swift
set_current_phase("Downloading...") // NO WARNING (!)
print_current_phase()
In this case the compiler produces no warning, because we implicitly assume (without any such guarantee) that the C API doesn't stash the pointer and never uses it after the call has returned. In practice, under -Onone the code ends up using a dangling pointer (and doesn't print the correct string). Ironically, under -O both of these examples end up "working", because the optimizer "saves us" by removing the temporary String object.
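To make that implicit assumption concrete, here is a minimal pure-Swift sketch of a callee that stays within the contract: it copies the bytes while the call is still in progress and never keeps the pointer. PhaseLog and record(_:) are hypothetical names used only for illustration; they stand in for a C function taking a const char * parameter.

```swift
// Hypothetical stand-in for a C API that takes `const char *`.
final class PhaseLog {
    private(set) var phases: [String] = []

    // The pointer is only assumed to be valid until this method returns,
    // so the bytes are copied into a String right away and the pointer
    // itself is never stored.
    func record(_ s: UnsafePointer<CChar>) {
        phases.append(String(cString: s))
    }
}

let log = PhaseLog()
// The literal goes through the implicit String-to-pointer conversion,
// which is fine here because the callee does not stash the pointer.
log.record("Downloading...")
print(log.phases)
```

A callee that stored s instead of copying it would reproduce the set_current_phase problem described above.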
While the general problem of safely converting a String into UnsafePointer is hard to solve (because of truly dynamically constructed Strings), the special case of passing a string literal directly to a C API or data structure could and should be handled correctly, intuitively and safely in Swift.
Proposal
The proposal is to:
- add the ExpressibleByStringLiteral conformance (and all the related other conformances) to UnsafePointer and UnsafeRawPointer
- then have the implementation of this conformance directly convert the internal string literal pointer (the "start" Builtin.RawPointer in the _ExpressibleByBuiltinStringLiteral initializer) into UnsafePointer/UnsafeRawPointer
This means that in both code samples above, the type checker will prefer the UnsafePointer/UnsafeRawPointer type for the literal. Because that conversion path doesn't do any heap allocations or copies, the pointer passed to the C API or struct is guaranteed to be the original constant string pointer.
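The _ExpressibleByBuiltinStringLiteral initializer isn't available outside the standard library, so the following is only a rough user-space approximation of the idea, not the proposed implementation. It assumes a hypothetical wrapper type, CStringLiteral, that takes its literal as a StaticString and exposes the literal's own storage as a pointer, without creating a temporary String.

```swift
// Hypothetical wrapper type for illustration; not part of the standard
// library and not the conformance proposed in this pitch.
struct CStringLiteral: ExpressibleByStringLiteral {
    let pointer: UnsafePointer<CChar>

    init(stringLiteral value: StaticString) {
        // String literals are expected to be pointer-backed; trap otherwise.
        precondition(value.hasPointerRepresentation,
                     "expected a pointer-backed string literal")
        // Reinterpret the UInt8 code units as CChar. This assumes that
        // viewing the literal's bytes as CChar is acceptable for C interop.
        pointer = UnsafeRawPointer(value.utf8Start)
            .assumingMemoryBound(to: CChar.self)
    }
}

let phase: CStringLiteral = "Downloading..."
// With the C declarations from the examples above in scope, this pointer
// could be passed along, e.g. set_current_phase(phase.pointer).
print(String(cString: phase.pointer))
```

The difference from the pitch is that the caller has to spell out the wrapper type, whereas the proposed conformance would let the literal be passed directly.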
Note that there are still going to be problematic cases (when not passing a literal directly) that are unaffected by this proposal:
var str = "string" // this is a String
let x = MyStruct1(name: str) // creates a dangling pointer, but produces a warning
set_current_phase(str) // creates a dangling pointer, doesn't produce a warning
set_current_phase("a" + "b") // creates a dangling pointer, doesn't produce a warning
Alternatives
(1) We could instead (or on top of the proposal above) make StaticString, which already is ExpressibleByStringLiteral, eligible for the string-to-pointer conversion that today only applies to String.
(2) We could instead (or on top of the proposal above) make even String-based string literals always convert to a global constant string pointer, possibly via mandatory optimizations or some other approach in the SIL pipeline.
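For context on alternative (1), this is roughly what StaticString already offers today when the conversion is done by hand. It is a small sketch that only uses existing StaticString members; the variable names are illustrative.

```swift
let phase: StaticString = "Downloading..."

// utf8Start is only valid for pointer-backed values, which is what a
// string literal is expected to produce.
precondition(phase.hasPointerRepresentation)

// The pointer refers to the literal's own null-terminated storage,
// so no temporary String is involved in obtaining it.
let start: UnsafePointer<UInt8> = phase.utf8Start
let count = phase.utf8CodeUnitCount

// Decode it back just to show that the bytes are the original literal.
let decoded = String(cString: start)
print(decoded, count)
```

Alternative (1) would in effect make this manual step unnecessary for StaticString arguments.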
Thoughts?