Ugliness bridging Swift String to char *

Hi All.

Swift automatically bridges String to char * when calling C functions. For instance, strlen gets translated as:

    public func strlen(_ __s: UnsafePointer<Int8>!) -> UInt

I can call it from Swift like this:

    strlen("|")

But I’m working with a C struct containing a char *:

    public struct _PQprintOpt {
        public var header: pqbool /* print output field headings and row count */
        public var align: pqbool /* fill align the fields */
        public var fieldSep: UnsafeMutablePointer<Int8>! /* field separator */
        ...
    }
    public typealias PQprintOpt = _PQprintOpt

When I try to assign to fieldSep like this:

    opt.fieldSep = "|"

I get the error:

    Cannot assign value of type 'String' to type 'UnsafeMutablePointer<Int8>!'
    
I assume that the difference is that strlen declares const char * and fieldSep is simply char *, so strlen is non-mutable while fieldSep is mutable. Is this correct?

I currently have this ugly hack to get this to work:

    var opt :PQprintOpt = PQprintOpt()
    guard let fieldSeparator = "|".cString(using: .utf8) else {
        throw Errors.databaseConnectionError("Could not set field separator")
    }
    opt.fieldSep = UnsafeMutablePointer(mutating:fieldSeparator)

Is there a cleaner way this could work, or should this be considered a compiler bug?

Also, why is the conversion to Swift an IUO? NULL is a totally valid value for fieldSep.

Thanks!

-Kenny

Hey, Kenny. The const vs non-const part is important, since both the implicit conversion and cString(using:) are allowed to return a pointer to the internal data being used by the String, and modifying that would be breaking the rules (and could potentially cause a crash). In the case of 'fieldSep', it's unlikely that anyone is going to mutate the contents of the string; it was probably just the original author (you?) not bothering to be const-correct. If you control this struct, a better fix would be to use 'const char *' for the field.

That said, that's not the main issue. The implicit conversion from String to UnsafePointer<CChar> is only valid when the string is used as a function argument, because the conversion might need to allocate temporary storage. In that case, it’s important to know when it’s safe to deallocate that storage. For a function call, that’s when the call returns, but for storing into a struct field it’s completely unbounded. So there’s no implicit conversion there.
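
To make the function-argument case concrete, here is a minimal sketch. The helper `byteLength` is a hypothetical name (it is not from libpq or the standard library); it simply takes the same kind of parameter that Swift gives strlen, so the String is converted to a temporary C string for the duration of the call only.

```swift
import Foundation

// Hypothetical helper with the same kind of parameter as the imported strlen.
func byteLength(_ s: UnsafePointer<CChar>) -> Int {
    // The pointer is only guaranteed to be valid until this function returns.
    return Int(strlen(s))
}

// The String argument is bridged to a temporary C string for this call only.
let separatorLength = byteLength("|")
print(separatorLength)
```

Nothing here keeps the pointer after the call returns, which is why the compiler can manage the temporary storage on its own.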

If you’re willing to limit your use of the pointer value to a single block of code, you can use withCString <https://developer.apple.com/reference/swift/string/1538904-withcstring>:

    myString.withCString {
      var opt :PQprintOpt = PQprintOpt()
      opt.fieldSep = UnsafeMutablePointer(mutating: $0)
      // use 'opt'
    }

Note that it is illegal to persist the pointer value beyond the execution of withCString, because it might point to a temporary buffer.

Alternately, if you’re willing to call free() later, you can use the C function strdup:

    opt.fieldSep = strdup(myString)
    // use 'opt'
    free(opt.fieldSep)
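
A slightly fuller sketch of the strdup approach, for illustration only. `PrintOptions` is a stand-in struct invented for this example (the real PQprintOpt comes from the libpq headers), and `roundTrip` is a hypothetical helper that stores the copy, reads it back, and frees it.

```swift
import Foundation

// Stand-in for the imported C struct; only the field discussed here.
struct PrintOptions {
    var fieldSep: UnsafeMutablePointer<CChar>! = nil
}

// Hypothetical helper: stores a heap copy of `separator`, reads it back, frees it.
func roundTrip(_ separator: String) -> String? {
    var options = PrintOptions()
    // strdup copies the bytes plus the terminating NUL into malloc'd memory,
    // so the pointer stays valid until free() is called.
    options.fieldSep = strdup(separator)
    defer { free(options.fieldSep) }  // free(NULL) is a no-op

    guard let stored = options.fieldSep else { return nil }  // strdup failed
    return String(cString: stored)
}

print(roundTrip("|") ?? "allocation failed")
```

The defer keeps the free() next to the allocation, which makes it harder to leak the copy on an early return.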

Of course, all of this is overkill for a string literal. Perhaps when Swift gets conditional conformances, we could consider making UnsafePointer<CChar> conform to ExpressibleByStringLiteral. Meanwhile, you can use the type designed specifically for this purpose, StaticString <https://developer.apple.com/reference/swift/staticstring>:

    let separator: StaticString = "|"
    opt.fieldSep = UnsafeMutablePointer(mutating: separator.utf8Start)

Note that you still have to do the init(mutating:) workaround for the fact that you’re not allowed to mutate this string.

Also, why is the conversion to Swift an IUO? NULL is a totally valid value for fieldSep.

You're welcome to mark that field as _Nullable on the C side, but without annotations Swift makes very few assumptions about pointers imported from C.

Hope that helps!
Jordan

···

On Mar 1, 2017, at 14:23, Kenny Leung via swift-users <swift-users@swift.org> wrote:

Hi Jordan.

Thanks for the lengthy answer.

Hey, Kenny. The const vs non-const part is important, since both the implicit conversion and cString(using:) are allowed to return a pointer to the internal data being used by the String, and modifying that would be breaking the rules (and could potentially cause a crash). In the case of 'fieldSep', it's unlikely that anyone is going to mutate the contents of the string; it was probably just the original author (you?) not bothering to be const-correct. If you control this struct, a better fix would be to use 'const char *' for the field.

This is the PostgreSQL client library, so I don’t really want to change it. (Although the source is available. Maybe I should submit a patch…)

That said, that's not the main issue. The implicit conversion from String to UnsafePointer<CChar> is only valid when the string is used as a function argument, because the conversion might need to allocate temporary storage. In that case, it’s important to know when it’s safe to deallocate that storage. For a function call, that’s when the call returns, but for storing into a struct field it’s completely unbounded. So there’s no implicit conversion there.

If you’re willing to limit your use of the pointer value to a single block of code, you can use withCString <https://developer.apple.com/reference/swift/string/1538904-withcstring>:

    myString.withCString {
      var opt :PQprintOpt = PQprintOpt()
      opt.fieldSep = UnsafeMutablePointer(mutating: $0)
      // use 'opt'
    }

Unfortunately, this is not always an option, since there are multiple char * fields in PQprintOpt. Or could I just nest .withCString calls? Looks ugly, but might work, I guess.
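
On the nesting question, a rough sketch of what nested withCString calls could look like. `PrintOptions` and its `caption` field are stand-ins invented for this example (the real PQprintOpt comes from the libpq headers), and `totalLength` is a hypothetical helper that only exists to show both pointers being used inside the closures.

```swift
import Foundation

// Stand-in for the imported C struct, with two char * fields.
struct PrintOptions {
    var fieldSep: UnsafeMutablePointer<CChar>! = nil
    var caption: UnsafeMutablePointer<CChar>! = nil
}

// Hypothetical helper: fills both fields and uses them inside the nested closures.
func totalLength(fieldSep: String, caption: String) -> Int {
    return fieldSep.withCString { sepPointer -> Int in
        return caption.withCString { captionPointer -> Int in
            var options = PrintOptions()
            options.fieldSep = UnsafeMutablePointer(mutating: sepPointer)
            options.caption = UnsafeMutablePointer(mutating: captionPointer)
            // Both pointers are only valid while these closures are running,
            // so every use of `options` has to happen here.
            return Int(strlen(options.fieldSep)) + Int(strlen(options.caption))
        }
    }
}

print(totalLength(fieldSep: "|", caption: "rows"))
```

Each extra char * field adds one more level of nesting, so past two or three fields a copying approach is probably easier to read.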

Note that it is illegal to persist the pointer value beyond the execution of withCString, because it might point to a temporary buffer.

Alternately, if you’re willing to call free() later, you can use the C function strdup:

    opt.fieldSep = strdup(myString)
    // use 'opt'
    free(opt.fieldSep)

Of course, all of this is overkill for a string literal. Perhaps when Swift gets conditional conformances, we could consider making UnsafePointer<CChar> conform to ExpressibleByStringLiteral. Meanwhile, you can use the type designed specifically for this purpose, StaticString <https://developer.apple.com/reference/swift/staticstring>:

    let separator: StaticString = "|"
    opt.fieldSep = UnsafeMutablePointer(mutating: separator.utf8Start)

Note that you still have to do the init(mutating:) workaround for the fact that you’re not allowed to mutate this string.

Ah - that's

Also, why is the conversion to Swift an IUO? NULL is a totally valid value for fieldSep.

You're welcome to mark that field as _Nullable on the C side, but without annotations Swift makes very few assumptions about pointers imported from C.

If it’s “making very few assumptions”, I would think that, for safety’s sake, functions that return a pointer would always be optional, forcing the user to deal with any possible null pointer returns.

Hope that helps!

Definitely!

-Kenny

···

On Mar 1, 2017, at 6:21 PM, Jordan Rose <jordan_rose@apple.com> wrote:

Talking myself into a circle there. We’re not talking about return values, but the value of PQprintOpt.fieldSep.

Also interesting - I thought being an IUO meant that you could not set the value to nil, but that’s not true. You can!
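
A tiny sketch of that behaviour, using a stand-in struct (`PrintOptions` is invented for this example; it only mimics how the imported field is declared):

```swift
// Stand-in for the imported C struct; the field is an IUO, as in the import.
struct PrintOptions {
    var fieldSep: UnsafeMutablePointer<CChar>! = nil
}

var options = PrintOptions()
options.fieldSep = nil  // assigning nil (NULL) is allowed

// An IUO is still an Optional underneath, so it can be tested and unwrapped safely.
let isNull = (options.fieldSep == nil)
if let separator = options.fieldSep {
    print("first byte:", separator.pointee)
} else {
    print("fieldSep is NULL")
}
```

The difference from a plain Optional only shows up when the value is used without unwrapping: reading through a nil IUO traps at run time instead of failing to compile.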

-Kenny

···

On Mar 1, 2017, at 7:08 PM, Kenny Leung via swift-users <swift-users@swift.org> wrote:

If it’s “making very few assumptions”, I would think that, for safety’s sake, functions that return a pointer would always be optional, forcing the user to deal with any possible null pointer returns.

Here are some basic recommendations for converting between String and C string representations:
https://swift.org/migration-guide/se-0107-migrate.html#common-use-cases

The best ways to convert Swift to C strings are:

1. Pass the Swift string as a function argument of type
   Unsafe[Mutable]Pointer<Int8>.

2. Use `String.withCString` to create a block of Swift code that can
   access the C string.

That doesn't help you if you want to store a pointer to the CString in a property without defining its scope. In that case, you just need to explicitly copy the string.

I like Jordan's recommendation for calling strdup. That's the canonical way of creating a new C string with its own lifetime.

Guillaume's explanation with withMemoryRebound(to:) is also correct, if you want to work at that level.

We probably should provide an API that handles the special case of String literals so you don't need to copy. As Jordan suggested, we could simply add a `cstring` property to StaticString. Feel free to file a bug for that so it isn't forgotten.
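
Purely as an illustration of what such a property might look like, here is a sketch written as an extension. The name `cstring` follows the suggestion above, but nothing like it exists in the standard library; the sketch also assumes that viewing the literal's UInt8 storage as CChar is acceptable for read-only use.

```swift
import Foundation

extension StaticString {
    /// Hypothetical property (not in the standard library): the literal's
    /// bytes viewed as a C string.
    var cstring: UnsafePointer<CChar> {
        // Only literals stored as a pointer to NUL-terminated UTF-8 qualify.
        precondition(hasPointerRepresentation)
        // Assumption: reinterpreting UInt8 storage as CChar is fine for read-only use.
        return UnsafeRawPointer(utf8Start).assumingMemoryBound(to: CChar.self)
    }
}

let separator: StaticString = "|"
let pointer = separator.cstring
print(String(cString: pointer))
```

Because the literal's storage lives for the whole program, a pointer obtained this way would not need a matching free(), but it must never be written through.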

-Andy

···

On Mar 1, 2017, at 7:08 PM, Kenny Leung via swift-users <swift-users@swift.org> wrote:

Follow-up on this:

The StaticString solution doesn’t work:

    let separator: StaticString = "|"
    opt.fieldSep = UnsafeMutablePointer(mutating: separator.utf8Start)

… because utf8Start returns UnsafePointer<UInt8>, and fieldSep is actually UnsafeMutablePointer<Int8>. There doesn’t seem to be any way to convert UnsafePointer<UInt8> to UnsafePointer<Int8>.

cString(using:) works because it returns [CChar]?, and CChar is the same type as Int8.

-Kenny

There is: .withMemoryRebound()

    var opt = PQprintOpt()
    let sep2: StaticString = "|"
    // Copy one extra byte so the terminating NUL is included in the C string.
    let count = sep2.utf8CodeUnitCount + 1
    opt.fieldSep = sep2.utf8Start.withMemoryRebound(to: Int8.self, capacity: count) {
      buffer in
      let p = UnsafeMutablePointer<Int8>.allocate(capacity: count)
      p.assign(from: buffer, count: count)
      return p
    }

    // use opt

    opt.fieldSep.deinitialize(count: count)
    opt.fieldSep.deallocate(capacity: count)

Cheers,
Guillaume Lessard

···

On Mar 6, 2017, at 12:28 AM, Kenny Leung via swift-users <swift-users@swift.org> wrote:
