Opaque Pointers in Swift


(Cory Benfield) #1

I wanted to discuss a recent difficulty I’ve encountered while writing a Swift program that uses a C library that has recently changed its API to use opaque pointers, with an eye towards asking whether there are suggestions for ways to tackle the problem that I haven’t considered, or whether some enhancement to Swift should be proposed to provide a solution.

A common trend in modern C code is to encapsulate application data by using pointers to “opaque” data structures: that is, data structures whose complete definition is not available in the header files for the library. This has many benefits from the perspective of library developers, most notably because it limits the ABI of the library, making it easier to change the internals without requiring recompilation or breaking changes. Pointers to these structures are translated into Swift code in the form of the OpaquePointer type.

Older C libraries frequently have non-opaque structures: that is, the structure definition is available in the header files for the library. When using code like this from Swift, pointers to these structures are translated to Unsafe[Mutable]Pointer<T>.

Both of these cases are well-handled by Swift today: opaque pointers correctly can do absolutely nothing, whereas typed pointers have the option of having behaviour based on knowing about the size of the data structure to which they point. All very good.
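To make the difference concrete, here is a minimal sketch. `Point` is a made-up struct standing in for an imported C type with a visible definition; it is not part of any real library.

```swift
// Hypothetical stand-in for a C struct whose full definition is visible to Swift.
struct Point {
    var x: Int32
    var y: Int32
}

// A typed pointer knows the layout of its pointee, so it can allocate,
// initialise and dereference memory.
let typed = UnsafeMutablePointer<Point>.allocate(capacity: 1)
typed.initialize(to: Point(x: 1, y: 2))
let pointeeSize = MemoryLayout<Point>.size   // known because Point is complete
let xValue = typed.pointee.x                 // dereferencing is allowed

// An opaque pointer only carries the address. It has no `pointee`
// and no notion of the size of what it points to.
let opaque = OpaquePointer(typed)
// opaque.pointee   // does not compile: OpaquePointer has no member 'pointee'

// Both pointers still refer to the same address.
let sameAddress = UnsafeRawPointer(opaque) == UnsafeRawPointer(typed)

typed.deinitialize(count: 1)
typed.deallocate()
```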

A problem arises if a C dependency chooses to transition from non-opaque to opaque structures. This is a transition that well-maintained C libraries are strongly incentivised to make, but if you want to write Swift code that will compile against both the old and new version of the library you run into substantial issues. To illustrate the issue I’ll construct a small problem based on the most widely-used library to recently make this transition, OpenSSL.

In OpenSSL 1.1.0 almost all of the previously-open data structures were made opaque, including the heavily used SSL_CTX structure. In terms of C code, the header file declaration changed from

  struct ssl_ctx_st {
    const SSL_METHOD *method;
    // snip 250 lines of structure declaration
  };
  typedef struct ssl_ctx_st SSL_CTX;

to

  typedef struct ssl_ctx_st SSL_CTX;

At an API level, any function that worked on the SSL_CTX structure that existed before this change was unaffected. For example, the function SSL_CTX_use_certificate has the same C API in both versions:

  int SSL_CTX_use_certificate(SSL_CTX *ctx, X509 *x);

Unfortunately, in Swift the API for this function changes dramatically, from

  func SSL_CTX_use_certificate(_ ctx: UnsafeMutablePointer<SSL_CTX>!,
                               _ x: UnsafeMutablePointer<X509>!) -> Int32

to

  func SSL_CTX_use_certificate(_ ctx: OpaquePointer!,
                               _ x: OpaquePointer!) -> Int32

The reason this is problematic is that there is no implicit cast in either direction between UnsafeMutablePointer<T> and OpaquePointer. This means the API here has changed in an incompatible way: types that are valid before the structure was made opaque are not valid afterwards. This adds a pretty substantial burden to supporting multiple versions of the same library from Swift code.
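A small sketch of the incompatibility, using mock functions in place of the two imported signatures. `Context`, `useContextOld` and `useContextNew` are invented for illustration. The explicit conversions rely on the standard library initialisers that convert between OpaquePointer and typed pointers.

```swift
// Hypothetical stand-in for SSL_CTX; not part of any real library.
struct Context { var refCount: Int32 = 0 }

// Mock of the signature imported from the old, non-opaque header.
func useContextOld(_ ctx: UnsafeMutablePointer<Context>!) -> Int32 {
    return ctx == nil ? 0 : 1
}

// Mock of the signature imported from the new, opaque header.
func useContextNew(_ ctx: OpaquePointer!) -> Int32 {
    return ctx == nil ? 0 : 1
}

let typed = UnsafeMutablePointer<Context>.allocate(capacity: 1)
typed.initialize(to: Context())

// useContextNew(typed)     // does not compile: no implicit conversion
let viaOpaque = useContextNew(OpaquePointer(typed))   // explicit conversion

let opaque = OpaquePointer(typed)
// useContextOld(opaque)    // does not compile in the other direction either
let viaTyped = useContextOld(UnsafeMutablePointer<Context>(opaque))

typed.deinitialize(count: 1)
typed.deallocate()
```

The explicit conversions compile, but each call site has to name the pointer type that the installed header version happens to produce, which is exactly what differs between the two versions.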

So far I have thought of the following solutions to this problem that I can implement today:

1. Write a C wrapper library that exposes a third, consistent type that is the same on all versions. Most likely this would be done by re-exposing all these methods with arguments that take `void *` and performing the cast in C code. This, unfortunately, loses some of the Swift compiler’s ability to enforce type safety, as all these arguments will now be UnsafeRawPointer. This is not any worse than OpaquePointer, but it’s objectively worse than the un-opaqued version.

2. Write a C wrapper library that embeds these pointers in single, non-opaque structures with separate types. This allows us to keep the type safety at the cost of verbosity and an additional layer of indirection.

3. Write two different Swift wrappers for each of these versions that expose the same outer types, and transform them internally. Not ideal: the conditional compilation story here isn’t good and distributing this library via Swift PM would be hard.
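For comparison, the idea behind options 2 and 3 can be approximated on the Swift side with one wrapper type per kind of handle. This is only a sketch under assumptions: `ContextHandle`, `CertificateHandle` and `useCertificate` are invented names, and a real binding would forward to the C functions instead of the mock shown here.

```swift
// Hypothetical wrapper types: one distinct Swift type per kind of C handle.
struct ContextHandle {
    let raw: OpaquePointer
    init(_ raw: OpaquePointer) { self.raw = raw }
    // Accepts the typed pointer an older, non-opaque header would produce.
    init<T>(_ typed: UnsafeMutablePointer<T>) { self.raw = OpaquePointer(typed) }
}

struct CertificateHandle {
    let raw: OpaquePointer
    init(_ raw: OpaquePointer) { self.raw = raw }
    init<T>(_ typed: UnsafeMutablePointer<T>) { self.raw = OpaquePointer(typed) }
}

// Mock of a binding function; a real wrapper would call into C here.
func useCertificate(_ ctx: ContextHandle, _ cert: CertificateHandle) -> Bool {
    return ctx.raw != cert.raw
}

// Plain Int storage stands in for the C allocations.
let ctxStorage = UnsafeMutablePointer<Int>.allocate(capacity: 1)
let certStorage = UnsafeMutablePointer<Int>.allocate(capacity: 1)

let ctx = ContextHandle(ctxStorage)                       // from a typed pointer
let cert = CertificateHandle(OpaquePointer(certStorage))  // from an opaque pointer

let accepted = useCertificate(ctx, cert)
// useCertificate(cert, ctx)   // does not compile: the argument types are distinct

ctxStorage.deallocate()
certStorage.deallocate()
```

The generic initialiser is deliberately loose (it accepts a pointer to any `T`), so type safety only holds from the wrapper outwards. The cost is the verbosity noted above: every handle type and every function needs a hand-written wrapper.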

I’d be really interested in hearing whether there is a solution I’m missing that can be implemented today. If there is *not* such a solution, is there interest in attempting to tackle this problem in Swift more directly? There are plenty of language changes that could be made to solve this problem (e.g. changing OpaquePointer to OpaquePointer<T> but making it impossible to dereference, then treating UnsafePointer<T> as a subclass of OpaquePointer<T>, or making OpaquePointer a protocol implemented by UnsafePointer<T>, or all kinds of other things), but I wanted to hear from the community about the suggested approach.

I’d love not to have to manually maintain a C wrapper just to escape Swift’s type system here.

Thanks,

Cory


(Félix Cloutier) #2

Could you do a conditional typealias?

#if OPENSSL_HAS_OPAQUE_POINTERS
typealias OpenSSLContext = OpaquePointer
#else
typealias OpenSSLContext = UnsafeMutablePointer<SSL_CTX>
#endif

Then as long as you don't need to build one single OpenSSL that supports both, you'll be fine (although it might work anyway, I think).
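A sketch of how code might be written against such an alias. `MockContext` and `ContextBox` are invented so the example compiles without OpenSSL, and the flag is assumed to be passed by hand (for example with `-D OPENSSL_HAS_OPAQUE_POINTERS`); nothing here detects the library version automatically.

```swift
// Hypothetical stand-in for the imported SSL_CTX type.
struct MockContext { var options: UInt = 0 }

// The flag name comes from the snippet above; it must be supplied at build time.
#if OPENSSL_HAS_OPAQUE_POINTERS
typealias OpenSSLContext = OpaquePointer
#else
typealias OpenSSLContext = UnsafeMutablePointer<MockContext>
#endif

// Code written against the alias never names the concrete pointer type.
final class ContextBox {
    let pointer: OpenSSLContext
    init(_ pointer: OpenSSLContext) { self.pointer = pointer }
}

let storage = UnsafeMutablePointer<MockContext>.allocate(capacity: 1)
storage.initialize(to: MockContext())

// OpenSSLContext(storage) resolves to an identity or a converting initialiser,
// depending on which branch was compiled.
let box = ContextBox(OpenSSLContext(storage))
let sameAddress = UnsafeRawPointer(box.pointer) == UnsafeRawPointer(storage)

storage.deinitialize(count: 1)
storage.deallocate()
```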

A problem with OpaquePointer<T> is that it forces T into existence. Swift doesn't have a notion of an incomplete type: if you write "struct foo;" in a header and bridge it to Swift, no "foo" type is imported. And of course, if it's supported, you shouldn't be able to instantiate one, and you shouldn't be able to get a type of pointer that is not an OpaquePointer to them, because you'd be able to dereference them. (Maybe they could be imported as enums with no cases? That's how we've been doing inconstructible types so far. Not sure if it would have strange implications.)

Félix

On 24 Oct 2017, at 01:14, Cory Benfield via swift-evolution <swift-evolution@swift.org> wrote:

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Johannes Weiss) #3

Hi Cory,

I think we're dealing with two separate issues here.

1) that all forward declared struct pointers get imported as an OpaquePointer which makes us lose all type-safety
2) that it's a fairly frequent case that C libraries evolve from 'pointers to fully declared structs' to 'pointers to forward declared structs'

Regarding 1)
------------
I fully agree that is pretty bad and I believe I have an idea what the Swift importer should do for that case:

For the following C types

    struct foo_s;
    typedef struct foo_s *foo;

    struct bar_s;
    typedef struct bar_s *bar;

the Swift importer today imports both `foo` and `bar` as `OpaquePointer`, which isn't very helpful. Instead, I believe the importer should declare two phantom types

    enum foo_s {}
    enum bar_s {}

and import `foo` as `typealias foo = OpaquePointer<foo_s>` and `bar` as `typealias bar = OpaquePointer<bar_s>`.

That seems to preserve the right semantics:

- we can't have any values of `foo_s` or `bar_s` as they're phantom
- we can however have pointers to them
- in this example the pointers are `OpaquePointer<foo_s>` and `OpaquePointer<bar_s>`, deliberately not `UnsafePointer<foo_s>` and `UnsafePointer<bar_s>` so we don't have an issue that the user can do 'unsafePtr.pointee'
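The proposed import can be approximated in user code today. The following is a sketch, not what the importer does: `TypedOpaquePointer` is a hypothetical generic wrapper (today's OpaquePointer is not generic), and the two consumer functions are mocks standing in for imported C functions.

```swift
// Uninhabited "phantom" types: no value of these can ever be created.
enum foo_s {}
enum bar_s {}

// Hypothetical generic wrapper approximating the proposed OpaquePointer<T>.
struct TypedOpaquePointer<Pointee>: Equatable {
    let raw: OpaquePointer
    init(_ raw: OpaquePointer) { self.raw = raw }
    // Deliberately no `pointee` property: the pointer cannot be dereferenced.
}

typealias foo = TypedOpaquePointer<foo_s>
typealias bar = TypedOpaquePointer<bar_s>

// Mock consumers standing in for imported C functions.
func consumeFoo(_ f: foo) -> Bool { return true }
func consumeBar(_ b: bar) -> Bool { return true }

// Raw storage stands in for memory owned by the C library.
let storage = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 8)
let f = foo(OpaquePointer(storage))
let b = bar(OpaquePointer(storage))

let fooOK = consumeFoo(f)
let barOK = consumeBar(b)
// consumeFoo(b)   // does not compile: 'bar' is not convertible to 'foo'

storage.deallocate()
```

The phantom parameter costs nothing at run time, since the wrapper stores only the address, while the compiler keeps `foo` and `bar` apart.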

What do people think about this proposed change?

Regarding 2)
------------

I feel you'd prefer if we'd import the above as `typealias foo = UnsafePointer<foo_s>` which I would be happy with but I can also see why `OpaquePointer` exists: It stops us from dereferencing the pointer at compile time (just like C does).

So yes, I agree we should fix (1). For the time being, I believe I have something that might work for your use case even today (and would work even more nicely if the above changes are implemented):

Today, we already have the following initialisers

struct UnsafePointer<T> {
    init?(_ ptr: OpaquePointer)
}

and

struct OpaquePointer {
    init?<T>(_ ptr: UnsafePointer<T>)
}

by just adding the following two trivial ones:

extension UnsafePointer {
    init?(_ ptr: UnsafePointer<Pointee>) { self = ptr }
}
extension OpaquePointer {
    init?(_ ptr: OpaquePointer) { self = ptr }
}

this should solve your problem:

- when you receive a pointer from the C library, store it as an OpaquePointer:
    let myOpaquePointer: OpaquePointer = OpaquePointer(c_library_create_function())
  This should now work regardless of whether you're linking the old or the new version of the C library.
- when you pass a pointer to the C library:
    c_library_consuming_function(.init(myOpaquePointer))
  This should select the right initialiser for you, except that it currently doesn't (see https://bugs.swift.org/browse/SR-6211).

But fortunately we can work around it:

--- SNIP ---
extension UnsafePointer {
    static func make(_ ptr: UnsafePointer<Pointee>) -> UnsafePointer<Pointee> {
        return ptr
    }
    static func make(_ ptr: OpaquePointer) -> UnsafePointer<Pointee> {
        return UnsafePointer(ptr)
    }
}
extension OpaquePointer {
    static func make<T>(_ ptr: UnsafePointer<T>) -> OpaquePointer {
        return OpaquePointer(ptr)
    }
    static func make(_ ptr: OpaquePointer) -> OpaquePointer {
        return ptr
    }
}

func mockCLibraryCreateOld() -> UnsafePointer<Int> {
    return UnsafePointer(UnsafeMutablePointer<Int>.allocate(capacity: 1))
}

func mockCLibraryCreateNew() -> OpaquePointer {
    return OpaquePointer(mockCLibraryCreateOld())
}

func mockCLibraryConsumeOld(_ x: UnsafePointer<Int>) {}
func mockCLibraryConsumeNew(_ x: OpaquePointer) {}

let fromCold: OpaquePointer = .make(mockCLibraryCreateOld())
let fromCnew: OpaquePointer = .make(mockCLibraryCreateNew())

mockCLibraryConsumeOld(.make(fromCold))
mockCLibraryConsumeNew(.make(fromCnew))
mockCLibraryConsumeOld(.make(fromCnew))
mockCLibraryConsumeNew(.make(fromCold))
--- SNAP ---

HTH

-- Johannes

On 24 Oct 2017, at 9:14 am, Cory Benfield via swift-evolution <swift-evolution@swift.org> wrote:



(Cory Benfield) #4

I don’t *think* so. In the case of OpenSSL the #define you want to use is OPENSSL_VERSION_NUMBER, which is defined in an OpenSSL header file. As far as I know, the Swift compiler does not see #defines from C header files when compiling Swift code (some experimenting suggests that this is still true, though I am not *certain* of that).

That would mean we’d have to pass this as a build flag with -D, which doesn’t really work with distribution via SwiftPM, and also would require that users know that this flag needs to be passed for newer OpenSSLs.

Cory

On 24 Oct 2017, at 18:11, Félix Cloutier <felixcloutier@icloud.com> wrote:

Could you do a conditional typealias?


(Cory Benfield) #5

What do people think about this proposed change?

I think keeping type information on OpaquePointer would be extremely useful, and definitely improves some of the sharp edges of that type.

this should solve your problem:

- when you receive a pointer from the C library, store it as an OpaquePointer:
    let myOpaquePointer: OpaquePointer = OpaquePointer(c_library_create_function())
  This should now work regardless of whether you're linking the old or the new version of the C library.
- when you pass a pointer to the C library:
    c_library_consuming_function(.init(myOpaquePointer))
  This should select the right initialiser for you, except that it currently doesn't (see https://bugs.swift.org/browse/SR-6211).

Yeah, this approach is probably acceptable. I’m going to stop short of calling it “good”, because frankly it remains a pretty unfortunate hack. It would be nicer if there were a way to tell the Swift compiler that the user should never initialise the structure, regardless of whether it’s opaque (that is, to tell the Swift compiler that when it sees SSL_CTX * it should translate that to OpaquePointer<SSL_CTX> instead of UnsafePointer<SSL_CTX>), but I will concede that adding support for that case is likely to be a bit too niche for the community to have much interest.

In the absence of that better version, your proposed interface will get the job done without being too gross. Thanks!

Cory

On 24 Oct 2017, at 18:23, Johannes Weiß <johannesweiss@apple.com> wrote:


(Johannes Weiss) #6

@Douglas_Gregor @Joe_Groff @jrose just to show what awful code Cory had to write simply to support both OpenSSL 1.0 and 1.1, the latter of which makes the structs opaque (forward declared).

This whole PR basically removes type safety. Most annoyingly, that code is now more type-safe in C than it is in Swift, because the opaque pointers still have distinct types in C but in Swift they're all just OpaquePointer.


(Jordan Rose) #7

On the type-safety front, we haven't been able to come up with a solution here that wouldn't be massively source-breaking. It can't even be a compiler mode because then you wouldn't be able to mix libraries built differently.


(Douglas Gregor) #8

I think we'd have to make this an opt-in feature specified in the module map for a library whose types will be imported as forward-declared entities. The frustrating part is that we don't know which types are "owned" by that module, because there can be multiple forward declarations across different modules!

Doug


(Cory Benfield) #9

I'm a bit pessimistic about this approach.

One extremely powerful narrative that Swift has is that it is easy to integrate with pre-existing C code. This is true for the most part, but it falls flat on its face with modern opaquified C libraries. If this problem isn't addressed "automatically" by the compiler, it stops being easy or particularly safe to integrate with those libraries, as it now requires the developer to know a) that OpaquePointer is bad for them, and b) that there is a mechanism to resolve this that involves using a modulemap (itself an entity that is not well-known or understood by many developers).

While I'm outlining risks I'll also say that this is a dangerous issue, because we risk being paralysed and letting the situation worsen. We're nervous about the source breakage, so we don't want to do that, but the other options are not really more appealing. This makes us inclined to kick the can down the road, and each time we do that we add more code to the pile of code that will be broken if we make a breaking change.

Sadly I don't have a better solution to propose than the ones already on the table.