Module Polymorphism?

aetherealtech · July 7, 2023, 1:52pm

Apologies if this or something similar has already been proposed. I searched and couldn’t find anything, but the keywords to search for are a bit ambiguous.

I have many times wished that Swift would let me do something that is possible in C-based languages, basically enabled by the fact that declarations and definitions can be separated. I can declare something like a function or class in a header, import it into libraries and use it. That library is then coupled to an interface, but not to any implementation. However, there is no runtime polymorphism. Eventually the symbols will be resolved statically (at link time), and it’s proven that exactly one implementation is included (none = “missing symbols” error, multiple = “duplicate symbols” error).

For example I can declare a log function in a header and include it everywhere, even in libraries that get built externally, then when I build my app, I can pick among multiple implementations of that function, which allows me to swap out a logging system at build time, but still know that one and only one log system is used everywhere (no “configure” or “setup” code at startup is necessary).

You can hack this kind of behavior into Swift by building two statically linked frameworks with identical module names, but the build system doesn’t reliably detect <1 or >1 linked frameworks, and you have to confusingly say that other frameworks link to one of these implementing frameworks, when really they don’t link to a particular implementation, just the static interface that framework implements. Plus this totally doesn’t work with SPM.

I’ve thought about what a proper, modern, Swifty solution to this problem would be, and concluded that it is “Module Polymorphism”.

Basically we define the concept of a “Module Interface” (possibly bad name since that also refers to other unrelated build artifacts), which is like a protocol but on a module level. You declare global functions with no body, global vars with { get } or { get set }, global associated types instead of type aliases, and concrete types (structs, enums, classes, actors) that declare but do not define their members (they’re concrete but their bodies look like protocols).

Then in package manifests you can make one module depend on another module interface, and declare that one module implements a module interface. When an executable is built, it should prove that every required module interface is implemented by exactly one included module. You can vary behavior, but the variation is at build time, which avoids both the performance cost and the semantic noise of dealing with runtime variation that really doesn’t exist but there’s no way today to express that it doesn’t exist.
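To make the "exactly one implementation" rule concrete, here is a minimal sketch of the check a build system could run. Everything in it (`Module`, `InterfaceError`, `checkInterfaces`) is a hypothetical name for illustration; nothing like this exists in SwiftPM today.

```swift
// Hypothetical model of the proposed build-time check; not a SwiftPM API.
struct Module {
    let name: String
    let implements: [String]   // module interfaces this module claims to implement
    let requires: [String]     // module interfaces this module depends on
}

enum InterfaceError: Error, Equatable {
    case missing(interface: String)                       // like a "missing symbols" error
    case duplicate(interface: String, modules: [String])  // like a "duplicate symbols" error
}

/// Reports every required interface that is not implemented by exactly one module.
func checkInterfaces(in modules: [Module]) -> [InterfaceError] {
    // Map each interface name to the modules that implement it.
    var implementers: [String: [String]] = [:]
    for module in modules {
        for interface in module.implements {
            implementers[interface, default: []].append(module.name)
        }
    }

    let required = Set(modules.flatMap { $0.requires })
    var errors: [InterfaceError] = []
    for interface in required.sorted() {
        let names = implementers[interface] ?? []
        if names.isEmpty {
            errors.append(.missing(interface: interface))
        } else if names.count > 1 {
            errors.append(.duplicate(interface: interface, modules: names.sorted()))
        }
    }
    return errors
}

let app = Module(name: "App", implements: [], requires: ["Logging"])
let loggerA = Module(name: "LoggerA", implements: ["Logging"], requires: [])
let loggerB = Module(name: "LoggerB", implements: ["Logging"], requires: [])

print(checkInterfaces(in: [app, loggerA]))           // one implementation: no errors
print(checkInterfaces(in: [app]))                    // no implementation linked
print(checkInterfaces(in: [app, loggerA, loggerB]))  // two implementations linked
```

The two error cases mirror what a C linker reports for undefined and multiply defined symbols, just at the granularity of a whole interface.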

I can go into more detail about how this could be useful and what exactly it would look like, but first I wanted to see if anyone has had similar ideas, or what people think of this on a high level. Is it a useful feature? Is this the Swifty way to do it?

Lancelotbronner · July 7, 2023, 4:32pm

This is something I’d love for cross-platform development, and is essentially header-only targets (which was requested but did not make it I believe) but for Swift!

I’d love to see this both in C/C++ and Swift!

I believe this is something extremely relevant to the following post and I’d want it for more or less the same reason.

The current workaround (that I’m aware of) is a re-exporting target with conditional dependencies based on platform. It needs to know the exhaustive list of packages “conforming” to the header module, and it only produces linker errors when a symbol is actually used, which makes it easy to forget symbols.

aetherealtech · July 8, 2023, 8:37pm

Yes, cross-platform is a perfect use case for this!

ibex10 · July 9, 2023, 1:28am

You are not alone.

I too believe that the lack of "This is an interface, this is an implementation of it" is an annoying problem.

John_McCall · July 9, 2023, 4:48am

This is pretty much just resilience. You want a stable binary interface to a module.

aetherealtech · July 9, 2023, 12:28pm

That part actually came after what initially motivated me to think about this. Primarily I just wanted a way to substitute a module in SPM. If only modules I’m building in my own manifest reference a module A, I can just comment out and in package references to achieve this (which still isn't great, what if I want the prod target to use one but the test target to use another?), but if A is also used by third party libraries I’m bringing in, I can’t stop my referencing of those libraries from also bringing in the A they reference, and then I can’t bring in mine without (surely a good thing) triggering a module name clash. Module aliasing goes in the opposite direction I want.

But then in my mind any proposal for SPM to allow saying “replace the A that’s already been brought in with my A” is kinda hacky. It’s allowing symbol errors, which modern build systems are supposed to prevent.

That reminded me of how I’ve always wanted a way to declare a global function in one module but define it in another, but that also felt out-of-place in Swift (another way of setting yourself up for symbol errors later).

So then a modern safe way to do any of this really needs to come with a modern way to define stable module interfaces.

Karl · July 9, 2023, 3:55pm

I don't think this is about the binary interface; IIUC it's about the programmatic interface:

The way I've dealt with this kind of stuff in the past (if memory serves, I don't have such a project to hand) is to create the individual implementation modules, let's say ThingMac and ThingWindows, and then to have an umbrella module Thing which exports the correct implementation:

#if os(macOS)
  @_exported import ThingMac
#elseif os(Windows)
  @_exported import ThingWindows
#else
  #error("Unsupported platform")
#endif

The client just has to write import Thing and they get the correct implementation.

As for ensuring that ThingMac and ThingWindows provide the same public API, you could just keep the expected .swiftinterface file in your repository and ensure that your builds produce exactly the same interface. You'd need to watch for inlinable code, though.

I think there could be a tool which does that for you (and handles inlinable stuff that doesn't matter to developers of client applications/libraries). And, of course, @_exported import is not an official language feature.

It's possible there are missing features which prevent cleanly expressing this module structure in SwiftPM. SE-0273 Package Manager Conditional Target Dependencies seems like it enables this kind of thing, but I think my uses predated that feature, so I haven't tried it.
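For reference, a manifest along the following lines is roughly what SE-0273 allows. This is an untested sketch: the target names come from the example above, while the package name and layout are assumptions.

```swift
// swift-tools-version:5.3
import PackageDescription

let package = Package(
    name: "Thing",
    products: [
        .library(name: "Thing", targets: ["Thing"]),
    ],
    targets: [
        .target(name: "ThingMac"),
        .target(name: "ThingWindows"),
        // Umbrella target: only the dependency matching the platform is built and linked.
        .target(
            name: "Thing",
            dependencies: [
                .target(name: "ThingMac", condition: .when(platforms: [.macOS])),
                .target(name: "ThingWindows", condition: .when(platforms: [.windows])),
            ]
        ),
    ]
)
```

The umbrella target's source would still need the `#if os(...)` re-exports shown earlier. The manifest only controls which implementation target gets built for a given platform.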

John_McCall · July 9, 2023, 10:43pm

I thought the request was to do it at link time; that’s a binary interface. Perhaps I misunderstood. You could also do it at compile time, of course.

aetherealtech · July 9, 2023, 11:02pm

You're right, I was proposing link time. That seems the more natural choice, as this should work with precompiled libraries too.

I imagine the module interface would be "compiled" to a binary interface, and this step would catch errors (syntax errors, duplicate or other invalid declarations, functions with bodies, etc.) at the time the interface is created rather than later when it gets used.

dnadoba · July 9, 2023, 11:31pm

swift-crypto already does something similar.
The Crypto module exports CryptoKit on Apple platforms which is part of the OS and is dynamically linked. On other platforms it will build a target with a very similar API shape which is statically linked. So it kind of supports compiled binary interfaces. You can see the details in the Package.swift of swift-crypto.

I don't think that we have a way of enforcing the exact same API for both libraries though. This isn't something swift-crypto could use anyway, as we neither want to nor can offer the exact same API: we exclude any hardware-specific APIs on non-Apple platforms.

aetherealtech · July 10, 2023, 12:07am

This is cool!

I tried to do this with a manually specified Swift flag:

#if USE_LOGGER_A
  @_exported import LoggerA
#elseif USE_LOGGER_B
  @_exported import LoggerB
#else
  #error("Missing or unknown logger")
#endif

I was hoping I could specify different modules in different targets (i.e. I have two app targets and want to use different log systems in each one), but alas, it appears you have to add the Swift flag to the umbrella module target, you can't add it to a target that uses the umbrella module. So I think this only really works for platform differences, where a built-in flag automatically varies as needed.
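A sketch of the manifest makes the limitation visible. The flag and the `LoggerA`/`LoggerB` names are from the snippet above; the umbrella target name `Logger` and the package layout are assumptions.

```swift
// swift-tools-version:5.3
import PackageDescription

let package = Package(
    name: "Logger",
    products: [
        .library(name: "Logger", targets: ["Logger"]),
    ],
    targets: [
        .target(name: "LoggerA"),
        .target(name: "LoggerB"),
        .target(
            name: "Logger",
            dependencies: ["LoggerA", "LoggerB"],
            // The flag is part of the umbrella target's own build settings,
            // so every client of "Logger" gets the same choice.
            swiftSettings: [.define("USE_LOGGER_A")]
        ),
    ]
)
```

Because `swiftSettings` applies to the target being compiled, a define placed on an app target only affects the app's own sources, not the `#if` inside the umbrella module.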

ibex10 · July 10, 2023, 2:35am

Actually, this is nearly possible in Swift, but one crucial element is missing.

The missing element is the concept of a protocol function, a term I have coined for a function declaration without a body.

We have a way of specifying the signatures of bound functions under named protocols, but we don't have a way of specifying the signature of a free function to say that it will be adopted to provide an implementation.

For example:

protocol func math () -> Math

adopt protocol func math () -> Math {
   ...
}

To see why this would be useful in Swift, consider this C++ interface specification:

// Math.h

using Z = long;
using N = unsigned long;
using R = double;

struct Math {
    virtual ~Math () {}

    virtual auto square_root (R) -> R = 0;
    virtual auto cube_root (R)   -> R = 0;
};

auto math () -> Math *;

The above specification can be implemented by one or more modules.

Here is one such module, for example:

// Math.cc

#include "Math.h"

namespace {
   struct _Math: Math {
      auto square_root (R) -> R override;
      auto cube_root (R)   -> R override;
   };

   auto _Math::square_root (R x) -> R {
      return R (0);
   }

   auto _Math::cube_root (R x) -> R {
      return R (0);
   }
}

auto math () -> Math * {
   return new _Math ();
}

I can compile the users of the Math module with the interface file Math.h only once; then, I can pick a (compiled) implementation of the Math module and link it with the (compiled) users to produce an executable. I don't need to recompile the users when I want to plug in a different implementation or a new one becomes available, as long as the interface does not change.

If the concept of protocol func were available, the same thing in Swift would look like this:

// Math.swift
// Interface package

typealias Z = Int
typealias N = UInt
typealias R = Double

protocol Math {
    func square_root (_ x: R) -> R
    func cube_root (_ x: R)   -> R
};

protocol func math () -> Math

Just as in C++, there can be several implementations of the above interface.

Here is what one such implementation would look like, if the concept of adopting a protocol function existed:

// Math1.swift
// An implementation package

import Math

private struct Math1: Math {
    func square_root (_ x: R) -> R { R (0) }
    func cube_root (_ x: R)   -> R { R (0) }
};

adopt protocol func math () -> Math {
   Math1 ()
}

Now, different modules can provide different implementations of the interface. Please note that the interface includes a free function as well, not just a protocol and a bunch of numeric types.

But, here comes the tricky bit:

// User.swift

import Math // not import Math1

let m = math ()

I am not sure how I would compile this and link it with the Math1 module.
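For comparison, the closest thing that works in today's Swift is probably a registration hook that the app calls at startup. The sketch below is only an illustration: `MathRegistry` and `StdlibMath` are hypothetical names, and the interface and implementation are shown in one file instead of separate modules.

```swift
// "Interface" part: what clients would compile against.
protocol Math {
    func squareRoot(_ x: Double) -> Double
}

// Hypothetical registration hook standing in for `protocol func math () -> Math`.
enum MathRegistry {
    private static var factory: (() -> Math)?

    // Called once at startup by whoever decides which implementation to use.
    static func install(_ makeMath: @escaping () -> Math) {
        precondition(factory == nil, "a Math implementation is already installed")
        factory = makeMath
    }

    static func make() -> Math {
        guard let factory = factory else {
            fatalError("no Math implementation installed")
        }
        return factory()
    }
}

func math() -> Math { MathRegistry.make() }

// "Implementation" part: would live in its own module.
struct StdlibMath: Math {
    func squareRoot(_ x: Double) -> Double { x.squareRoot() }
}

// Startup code in the app picks the implementation.
MathRegistry.install { StdlibMath() }

let m = math()
print(m.squareRoot(9.0))
```

Unlike the link-time approach, a missing or duplicate implementation here only shows up at runtime, as a trap in `make()` or `install(_:)`.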

aetherealtech · July 10, 2023, 2:38pm

I normally handle variation this way, and just deal with the lack of the selectable global function with some kind of “setup” at startup. However, even if this problem were solved (and note that this would be solved by Module Interfaces, you could define a Module Interface that only declares the protocol and a global function to return an instance), it has some problems.

First is performance. Probably not that important for most of us, but the game engine authors will find the cost of dynamic dispatch there potentially deal-breaking.

Second is semantics. Let’s say the Math protocol doesn’t work with primitives, but instead its own corresponding type that represents, say, real numbers:

protocol Real : ExpressibleByIntegerLiteral, ExpressibleByFloatLiteral {
  init?(_ stringValue: String)

  var doubleValue: Double { get }
}

protocol Math {
   func squareRoot(_ value: Real) -> Real
}

protocol func math() -> Math

struct Real1: Real {
  init?(_ stringValue: String) {
    …
  }

  init(integerLiteral int: Int) {
    …
  }

  init(floatLiteral float: Double) {
    …
  }

  var doubleValue: Double { 
    …
  }
}

struct Math1: Math {
  func squareRoot(_ value: Real) -> Real {
    let value = value as! Real1 // Gross!

    …
  }
}

adopt protocol func math() -> Math {
  Math1()
}

struct Real2: Real {
  init?(_ stringValue: String) {
    …
  }

  init(integerLiteral int: Int) {
    …
  }

  init(floatLiteral float: Double) {
    …
  }

  var doubleValue: Double { 
    …
  }
}

struct Math2: Math {
  func squareRoot(_ value: Real) -> Real {
    let value = value as! Real2 // Gross!

    …
  }
}

adopt protocol func math() -> Math {
  Math2()
}

(We'd also need a factory for Real)

Notice how you have to force downcast the related type to the “matching” concrete implementation. This design does not express that if Math1 exists in the program, the only Real that can exist is Real1. We’re really incorrectly expressing that any Math can work with any Real, but that’s not true.
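A stripped-down, runnable version shows the mismatch slipping past the compiler. It is a sketch with simplifications: the literal conformances are dropped, `squareRoot` returns an optional, and `as?` replaces `as!` so the failure can be observed without crashing.

```swift
// Simplified versions of the protocols above.
protocol Real { var doubleValue: Double { get } }
protocol Math { func squareRoot(_ value: Real) -> Real? }

struct Real1: Real { var doubleValue: Double }
struct Real2: Real { var doubleValue: Double }

struct Math1: Math {
    func squareRoot(_ value: Real) -> Real? {
        // Math1 only knows how to work with Real1.
        guard let value = value as? Real1 else { return nil }
        return Real1(doubleValue: value.doubleValue.squareRoot())
    }
}

let math: Math = Math1()
let ok = math.squareRoot(Real1(doubleValue: 16))
let mismatched = math.squareRoot(Real2(doubleValue: 16))  // compiles, fails at runtime

print(ok?.doubleValue as Any, mismatched?.doubleValue as Any)
```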

“Make Real an associatedtype of Math!”

If we did that, we correctly express what each Math can work with, but we lose the ability to work with Math abstractly. The client has to know the concrete Math it’s working with, but we’re trying to write libraries that can work with any Math library.
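Here is a sketch of what that looks like, again with simplified protocols. The associated type name `Number` and the `fourthRoot` client function are made up for the example.

```swift
protocol Real { var doubleValue: Double { get } }

protocol Math {
    associatedtype Number: Real
    func squareRoot(_ value: Number) -> Number
}

struct Real1: Real { var doubleValue: Double }

struct Math1: Math {
    // Number is inferred to be Real1, so no downcast is needed.
    func squareRoot(_ value: Real1) -> Real1 {
        Real1(doubleValue: value.doubleValue.squareRoot())
    }
}

// Client code now has to be generic over the concrete Math it is handed.
func fourthRoot<M: Math>(of value: M.Number, using math: M) -> M.Number {
    math.squareRoot(math.squareRoot(value))
}

let result = fourthRoot(of: Real1(doubleValue: 81), using: Math1())
print(result.doubleValue)
```

The types line up, but the generic parameter spreads through every client signature, and at the call site someone still has to name `Math1` and `Real1`.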

With module interfaces it would look like this. First the module interface:

struct Real : ExpressibleByIntegerLiteral, ExpressibleByFloatLiteral {
  init?(_ stringValue: String)

  var doubleValue: Double { get }
}

struct Math {
   func squareRoot(_ value: Real) -> Real
}

func math() -> Math

Then an implementing module:

public struct Real: ExpressibleByIntegerLiteral, ExpressibleByFloatLiteral {
  public init?(_ stringValue: String) {
    …
  }

  public init(integerLiteral int: Int) {
    …
  }

  public init(floatLiteral float: Double) {
    …
  }

  public var doubleValue: Double { 
    …
  }
}

public struct Math {
  public func squareRoot(_ value: Real) -> Real {
    …
  }
}

public func math() -> Math {
  .init()
}

And another implementing module:

public struct Real: ExpressibleByIntegerLiteral, ExpressibleByFloatLiteral {
  public init?(_ stringValue: String) {
    …
  }

  public init(integerLiteral int: Int) {
    …
  }

  public init(floatLiteral float: Double) {
    …
  }

  public var doubleValue: Double { 
    …
  }
}

public struct Math {
  public func squareRoot(_ value: Real) -> Real {
    …
  }
}

public func math() -> Math {
  .init()
}

Downcasting is gone. By declaring a particular concrete Real and Math together in a module, we’re expressing that they must come together. There also should be no runtime cost. The calls to Math.squareRoot should be statically wired to the particular Math linked in at link time.

We also could just expose a default init in Math instead of the global factory, or move the factory to be a static func on Math.
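Within a single module you can get close to this today by picking typealiases with a compilation condition. The sketch below assumes a hypothetical `USE_MATH_2` flag (for example passed as `-D USE_MATH_2`) and invents the `Real1`/`Math1`/`Real2`/`Math2` names. It has the same limitation as the Swift-flag approach earlier in the thread: the choice is fixed when the defining module is compiled, not at link time.

```swift
// Two interchangeable implementations with the same shape.
struct Real1 { var doubleValue: Double }
struct Math1 {
    func squareRoot(_ value: Real1) -> Real1 {
        Real1(doubleValue: value.doubleValue.squareRoot())
    }
}

struct Real2 { var doubleValue: Double }
struct Math2 {
    func squareRoot(_ value: Real2) -> Real2 {
        Real2(doubleValue: value.doubleValue.squareRoot())
    }
}

// USE_MATH_2 is a hypothetical compilation condition chosen for this example.
#if USE_MATH_2
typealias Real = Real2
typealias Math = Math2
#else
typealias Real = Real1
typealias Math = Math1
#endif

func math() -> Math { Math() }

// Client code names only Real and Math; the calls are statically dispatched.
let root = math().squareRoot(Real(doubleValue: 25))
print(root.doubleValue)
```

There is no downcast, and `Real` and `Math` always come as a matched pair, which is the property the module interface version is meant to guarantee across module boundaries.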