Building resilience ... on Windows

Looking to get Windows to parity with macOS (it feels like it has long since surpassed the state on Linux), the resilience compilation mode becomes a bit of a problem.

When I originally brought up the idea of trying to do something with weak linking on Windows, the suggestion from @jrose and @John_McCall was to simply disable it for the time being on PE/COFF. The problem is that PE/COFF does not provide weak linking semantics: everything must be resolved fully at link time, and anything missing is a fatal error.

Since that time, the resilience path seems to be injecting weakly linked symbols. This doesn't really work on Windows. So, this raises the question of what to do there?

Secondary to that is the fact that I am trying to get Windows to pass the validation test suite as well, and there are a number of tests which fail to build because they expect the @_weakLinked attribute to work.

CC: @Mike_Ash @John_McCall

This is not exactly weak linking, but maybe you could fake it with some runtime code:

import WinSDK  // Windows only
let lib = LoadLibraryA("something.dll")!  // LoadLibrary itself is a C macro, so call the A (or W) variant
let address = GetProcAddress(lib, "myFunction")!

(Those should be globals, obviously, so you only load the library once.)

This would have to be more than just code in the runtime; it would need to be open-coded into each module by the compiler for anything which is weakly linked. It certainly is possible.

Exactly. I meant "run time" as opposed to "link time", not as a reference to the language runtime.

This might run afoul of the Windows Store rules though; see Raymond Chen's post "Why not use weak linking to solve the retargetable library problem?"

Yeah, that is the problem: weak linking is not supported, period. The only option is to use LoadLibrary and GetProcAddress. We already have a few things which run afoul of the Store rules though (like the stack unwinding).

Is there a Windows Store-blessed way of handling conditionally-available APIs then?

The Raymond Chen blog post @michelf linked makes it sound like GetProcAddress is only disallowed for system APIs, but that you can use it with packaged DLLs that you load via LoadPackagedLibrary. It seems like generating a small thunk to do the runtime lookup would be feasible for resilient Swift libraries then, since there are unlikely to be Swift libraries in the UWP itself anytime soon.

I also still think it'd be acceptable to build as if we were going to use GetProcAddress for hand-crafted weak linking, but not actually do it, since we don't have a stable Swift ABI on Windows. It's not like anyone's using the resilience support on Linux right now.

(That said, we're also leaning on it for binary compatibility between libraries, so if we ever have a binary distribution story for either Linux or Windows it'll get interesting.)

I don't know why resilience would be reliant on conditionally-available symbols; that might just be some sort of linker magic workaround that we don't need on PE/COFF. It'd be helpful to know what's being weakly imported and (as best as you can tell) why.

Availability checking is pretty reliant on conditionally-available symbols, and if PE/COFF doesn't support them directly, our best bet is probably to use an IR pass to implement them with a global constructor.

@John_McCall - yeah, PE/COFF doesn't support conditionally available symbols. Hmm, would you mind describing what you had in mind for the IR pass? I'm happy to implement that.

As for resilience: it seems to use weak linking for the interfaces. I ran into that in the validation test suite.

We should only absolutely need weak linking for linking to conditionally available APIs. Like John said, it shouldn't be fundamentally necessary for resilience.

I was thinking something that would rewrite all references to weak symbols in the program:

  • references from code would be lowered into loads from a shared global initialized with the symbol; now all references are from data
  • references from data would be replaced with null, but the pass would remember the address of the data
  • the pass would emit all the remembered (symbol name, address) pairs into a section
  • some constructor would resolve the symbols dynamically

But if we can defer that until we have to worry about availability, that would be great.
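
To make the last two steps of that list concrete, here is a rough Swift model of what the constructor might do with the emitted (symbol name, address) pairs. It is only a sketch: `WeakRef`, `Resolver`, and `resolveWeakRefs` are made-up names, and the resolver is a plain dictionary standing in for whatever dynamic lookup (presumably GetProcAddress on Windows) the real constructor would use.

```swift
// Hypothetical model of the runtime half of the pass: a table of
// (symbol name, slot) pairs and a "constructor" that fills the slots.
// The resolver is a dictionary here so the sketch runs on any platform.

/// One record the pass would emit into its section.
struct WeakRef {
    let symbolName: String
    /// Data location that was replaced with null at compile time.
    let slot: UnsafeMutablePointer<UnsafeRawPointer?>
}

/// Stand-in for the dynamic symbol lookup.
typealias Resolver = (String) -> UnsafeRawPointer?

/// Patches every slot whose symbol resolves and leaves the rest null.
/// Returns the number of slots that were patched.
func resolveWeakRefs(_ table: [WeakRef], using resolve: Resolver) -> Int {
    var patched = 0
    for ref in table {
        if let address = resolve(ref.symbolName) {
            ref.slot.pointee = address
            patched += 1
        }
    }
    return patched
}

// Two slots standing in for data that referenced weak symbols.
let presentSlot = UnsafeMutablePointer<UnsafeRawPointer?>.allocate(capacity: 1)
let missingSlot = UnsafeMutablePointer<UnsafeRawPointer?>.allocate(capacity: 1)
presentSlot.initialize(to: nil)
missingSlot.initialize(to: nil)

// A fake "exported symbol" so the resolver has an address to hand back.
let fakeSymbol = UnsafeMutableRawPointer.allocate(byteCount: 1, alignment: 1)
let exports: [String: UnsafeRawPointer] = ["newAPI": UnsafeRawPointer(fakeSymbol)]

let table = [
    WeakRef(symbolName: "newAPI", slot: presentSlot),
    WeakRef(symbolName: "notShippedYet", slot: missingSlot),
]
let patchedCount = resolveWeakRefs(table, using: { exports[$0] })
```

After this runs, the slot for the available symbol holds its address and the slot for the unavailable one is still null, which is the state an availability check would then test against.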

I think that we really should start thinking and implementing this stuff soon.

For one, the Windows API story is starting to evolve pretty quickly (e.g. Windows 10 versions 1809 and 1903 have both introduced new APIs which are useful to developers). Windows is slowly moving to a model which feels more like a rolling release (like Linux and macOS), since these updates are not being versioned as new major Windows releases.

Second, I think that Windows support has actually matured significantly on the compiler and standard library side. Although the Foundation side is not yet complete, it's getting close and seems within reach of getting even the test suite running (well, it already runs; there are a handful of tests which are failing for the same reason, and those need to be investigated).

Separation of the runtime from the compiler would really aid in the development at this point since things have stabilized now. The CI seems to be running well and catching regressions, and has decent coverage.

In terms of test coverage, the one area that is left is the validation test suite which I am looking into.

But, I am not sure what other low hanging fruit there is. Adding extensions would certainly be useful, but I think that will evolve more naturally.

What do you think would be a good point to move such a pass to?

It’s definitely an as-late-as-possible sort of pass. If it could be done in the linker, that would be best. :)

Before we go too far down this road, how does MS recommend writing code that wants to use such-and-such API if it’s running on a new enough OS?

EDIT: okay, I read the Raymond Chen post; I understand his point, but it’s definitely unfortunate for us (and any other tools that are capable of verifying that weak-linked symbols aren’t actually used at runtime when unavailable). But it does sound like this constructor is the right way to go, I guess?

Oh, sorry, I thought by as-late-as-possible you meant time-wise, not position in the pipeline.

Yes, I agree that this should be a late-stage pass, perhaps a late SIL pass right before IRGen (dare I suggest CodeGenPrepare? :smiley: and then cry as we try to remove it for the rest of time).

I agree that it is unfortunate for Swift. I don't think that they really do back deployment, instead opting for a local distribution of the runtime (hence the whole DLL hell problem that was rampant in the 90s). They have since suffered the problems of that design, and the new solution is SxS (side-by-side) installations. The system has a SxS cache of the various versions, and you link against the specific revision of the API that you built against. This allows the system to back-deploy by building a new version of the DLL for the target OS.

There is something which they call API contracts, which is part of the UWP world, but those apply to the high-level APIs, more around the Windows UI frameworks and things like Bluetooth and GPS APIs. For the packaged DLLs (which are user-provided), there is LoadPackagedLibrary, which can get you the module handle to use with GetProcAddress. So, we do have a way to deal with that when the time comes.

Well, we don’t actually want to fight the platform’s direction on this too hard because it’ll create extra tension for people trying to integrate Swift code into an existing project, which of course is a goal. If SxS is the right way to do this on Windows, we should consider what that means from the implementation on up to the language design. That’s why I think it might be useful to try to, in the short term, narrowly tackle the problem of these weak-linked symbols that maybe don’t need to be weak-linked.

To me that feels a bit overkill. It seems to me like we could do an adequate job in IRGen generating lazy GetProcAddress stubs, and that would be easier to work with in the short term.

I agree with your point about first fixing places we're using weak imports unintentionally though; even with resilience enabled, there should be no reason for us to do so on non-Darwin platforms today.
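
For illustration, a lazy stub of the kind suggested above might behave roughly like this. This is a platform-independent sketch, not what IRGen would emit: `LazyStub` and its members are invented names, the `(Int32) -> Int32` signature is arbitrary, and the dictionary stands in for a GetProcAddress lookup. It is also not thread-safe, which a real stub would have to be.

```swift
/// Hypothetical lazy stub for one weakly imported function: the lookup
/// happens on the first call and the result (found or not) is cached.
final class LazyStub {
    typealias Function = (Int32) -> Int32

    private let symbolName: String
    private let resolve: (String) -> Function?
    private var resolved = false
    private var target: Function?
    /// How many times the resolver was consulted (for demonstration only).
    private(set) var lookupCount = 0

    init(symbolName: String, resolve: @escaping (String) -> Function?) {
        self.symbolName = symbolName
        self.resolve = resolve
    }

    /// Calls the target if it exists; returns nil when the symbol is unavailable.
    func call(_ argument: Int32) -> Int32? {
        if !resolved {
            target = resolve(symbolName)
            lookupCount += 1
            resolved = true
        }
        return target.map { $0(argument) }
    }
}

// Stand-in for the library's export table.
let exportedFunctions: [String: LazyStub.Function] = ["double": { $0 * 2 }]

let doubleStub = LazyStub(symbolName: "double") { exportedFunctions[$0] }
let missingStub = LazyStub(symbolName: "missing") { exportedFunctions[$0] }

let first = doubleStub.call(21)   // resolves, then calls
let second = doubleStub.call(4)   // reuses the cached target
let absent = missingStub.call(1)  // symbol not present
```

Note that this only covers calls to functions. It does nothing for references to data symbols, or for references that live in data.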

Unfortunately, I think in full generality we need to reference data symbols, and moreover reference them from data, which means there's really no way to stub them. Remember this isn't just Swift emitting references to OS code, it's Swift emitting references to other Swift code across a stable binary interface.

I guess I was taking for granted that we know enough during IRGen to know that we're referencing data that might be weakly referenced so we could go through an accessor if needed instead.

I'm pretty sure there are data-to-data dependencies in the ABI, e.g. with resilient protocols and overrides.