Stability of Swift on Windows?

+1

I think the best thing the rest of us can do to help Swift on Windows is to ensure that libraries are available for people who want to use it, and to make a best effort to ensure everything works.

That's why I suggest using continue-on-error for the Windows build if it's being flaky. You'll see if those builds fail and can investigate the failure to decide whether you've actually broken support for that platform, but flaky tasks won't block other CI tasks from executing, so it won't get in the way of other platforms.

Obviously it is less than ideal, but it's better than dropping support and saying you don't even want to know about those build failures.


Absolutely, having libraries that are usable on Windows would be a huge boon, and almost a requirement in my mind.

Personally, I'd also like to request that people file issues with as much information as possible (which @dabrahams actually did!).

In order to improve the state of Windows, we need to get more systematic about the polish, and to do that, it really does help to have a concrete list of items so that we can classify the types of problems and work through them.

As a concrete example, we are finally in the last stages of removing the alterations to Visual Studio. This means that repairing the Swift toolchain after updates to Visual Studio will no longer be needed. Furthermore, we no longer have ordering dependencies either. This work required a lot of threading of information through the build itself and has forced a few more changes to actually take place. The problem is that staging these changes is often a time-consuming process that needs to be done carefully, and so it has taken a while before we could address it.

Identifying the top pain points is important because it helps focus on the items that would improve things the most. I know that the debugging story is still painful, and that stability for concurrency is a problem, which is currently a priority item. The other piece that remains an issue and is a priority is some amount of work to help improve SPM-based builds. Once the current set of build regressions is resolved, I am hopeful that the other pain points can start being ameliorated.


Thanks for your reply, Doug!

That's consistent with the fact that @compnerd can't reproduce the problem on his own machine.

If you're making heavy use of async.

We're not using async at all, unless something in the Windows implementations of the runtime or Foundation is using it. Our codebase is 100% supposedly-portable non-async Swift code, with nothing of significance in an #if os(...) block AFAICT.

In the issues you linked, I don't see much investigation into what's actually happening. Has anyone managed to catch the crash so we can see what's going on?

If by "catch," you mean, "observe on a local machine," then unfortunately not. It doesn't seem to be just one thing; sometimes it crashes during the build phase, sometimes during the test phase. That might suggest that it's an SPM issue(?)

Platform support is hard, especially for Windows because it's so different from Unix-like OS's, and the environment can matter in surprising ways.

Sure, and no shade thrown on those trying. I'm calling attention to the issue because it would be sad if Swift got a reputation of not-really-supporting Windows because of how things end up playing out in the very common GitHub CI scenario.

I don't think so. Do you have any reason to believe that might be the key?

I have not. Do you have any reason to believe that would make a difference?

We're using the version recommended by @compnerd, who wrote our Windows CI actions. I figured if the release version was a better bet, he'd have used it. As for continue-on-error, Windows failures have not prevented our tests from completing on other platforms.

We very much appreciate your efforts! That said, I don't think that change can possibly address the problem. In all the cases we're concerned about, the job completes OK if you just re-run it (enough times).

an issue that I don't have any idea of how to work around is the lack of access to the host

The only idea that occurs to me (and I'm just guessing here) is that maybe you could somehow use a virtualized Windows, to which you could have complete access, on the (probably already virtualized) host. I'm sure that's nontrivial, if it's even possible, though.

Sorry, I didn't mean to imply that it would solve the issue, but rather that it may help us understand what the failure is. I think the struggle with the GHA builder so far has been gaining an understanding of the failure. Were these failures local, we would have minidumps (akin to core dumps on Unix), which would allow us to inspect what occurred so that we could address the issue.

I intend to spend some time thinking about how to collect telemetry so that we can better analyze and repair issues that we encounter as Swift starts to gain broader usage on Windows.

i am using async. the very few things i am working on that do not use async still use swift-atomics, and i have not been able to get either of those two things working on Windows.

(for those keeping score, swift-nio depends on swift-atomics, so no atomics means no networking either!)

i don’t mean to distract from the very valuable efforts to get Windows CI working for swift projects. in particular, the pull request "Fix JSONMessageStreamingParser error message formatting" by tristanlabelle (apple/swift-tools-support-core #398) is very encouraging to me as i have seen that exact CI failure many times. i hope that PR gets merged soon.

i just mean this as a reminder that there needs to be proportional effort from the swift project leadership towards supporting concurrency and atomics on Windows, because fixing the CI problems will have limited impact until concurrency and atomics become available as well.

Are you suggesting it's likely that it's trying to emit a diagnostic in these flaky cases—even though the code itself shouldn't generate one—and then crashing? Consider again that the crashes sometimes show up during testing.

I wonder if we'd get reliability by forcing single-threaded operation? That might be worth an experiment, because it would probably indicate a race condition in the implementation of something used by SPM.

I have no experience of my own with it, but you can find statements like “After switching from PowerShell to bash … we have not had any of the random failures on Windows we were seeing before”.

Edit: Note that the default shell for GitHub Windows builds is PowerShell, maybe this is why nobody can see a problem when using it on a local machine (using cmd)? Maybe changing to cmd is enough to resolve the random crashes?

Interesting. @compnerd, maybe you should try that too.

I don't find that obnoxious, it makes sense for them not to support platforms they don't use. What I find obnoxious is when somebody submits a pull with those small tweaks and they don't respond (obviously, larger tweaks are a different matter and are completely up to them to merge or reject).

I think you mean "windows CI for an OS we ourselves do not use," as you would be using the CI. I don't think anybody is expecting it either, it was a suggestion to try that and see if it made a difference.

Since you later note that this is likely a race condition, more cores are likely to lead to less contention and a lower failure rate. Do you disagree? It is worth trying to see if it makes much of a difference, after which you can decide if it's worth maintaining.

right now, supporting "deploy-only" platforms like Android is challenging because SPM cannot distinguish between host platform and target platform. this means we cannot conditionally include dependencies based on target platform. i imagine that may be a factor in why your PR has not been accepted yet.


I want to refine this a bit. SPM absolutely distinguishes between host and target platforms; it’s just that all of the built-in language controls (correctly!) inspect the host and there’s no equivalent for targets.


what would it take to conditionally build Package.swift manifests based on target platform?

We're getting off the original thread here, but

  • SwiftPM could define (-D) config variables based on target. Not perfect since these are just on/off things, but has the advantage of being testable now.

  • The compiler and SwiftPM could collaborate to offer #if targetOs(…) or similar. (Changing the current meaning of #if os(…) could break packages that conditionally import Glibc or whatever.)

  • SwiftPM could allow multiple Package.swift files for different targets, like it does for different tools versions. I don't think this is a good idea myself, especially without support for factoring out common logic, but it's a possibility.

  • SwiftPM could continue adding conditionals for each thing that needs to be conditionalized on a per-target basis, as part of the SwiftPM DSL rather than at the compiler level.

Any of these would need a proposal.
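As a rough illustration of the first option above, a manifest could branch on a compilation condition that SwiftPM would define from the target triple. This is only a sketch: `SWIFTPM_TARGET_ANDROID` is a hypothetical name, SwiftPM does not define any such condition today, and the package name `Example` is a placeholder. swift-atomics is used only because it came up earlier in the thread.

```swift
// swift-tools-version:5.7
import PackageDescription

var dependencies: [Package.Dependency] = []
var targetDependencies: [Target.Dependency] = []

// HYPOTHETICAL: SWIFTPM_TARGET_ANDROID is not defined by any SwiftPM release.
// The idea is that SwiftPM would pass -D SWIFTPM_TARGET_ANDROID when compiling
// the manifest for an Android target. Because nothing defines it today, this
// branch is always taken and the dependency is always included.
#if !SWIFTPM_TARGET_ANDROID
dependencies.append(
    .package(url: "https://github.com/apple/swift-atomics.git", from: "1.0.0")
)
targetDependencies.append(
    .product(name: "Atomics", package: "swift-atomics")
)
#endif

let package = Package(
    name: "Example", // placeholder package name
    dependencies: dependencies,
    targets: [
        .target(name: "Example", dependencies: targetDependencies),
    ]
)
```

As noted in the list, such conditions are plain on/off switches, so anything finer-grained than "is the target platform X" would need one of the other approaches.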


this sounds like by far the simplest solution from the SPM side. i have also heard of people doing this manually, so it would be an easier adoption than the other three ideas.

do you have any advice for @Finagolfin and/or the maintainers of Embassy? right now, supporting Android is a very unattractive choice for library projects because it prevents the library from using any dependencies that do not themselves support Android. and this has a cascading effect across the package ecosystem, because most packages depend on other packages.

I don't see any challenge, as linux is often supported and Android usually just drops into existing linux support, and SPM clearly knows the difference between host and target, as I cross-compile using SPM all the time. You may be referring to package manifests alone, but taking the example of SPM's own package manifest, you simply use conditional compilation, i.e. #if os(Linux), for the host and SPM target platform conditions for the target.
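A minimal sketch of that split, assuming a placeholder package named `Example` and two made-up define names (`BUILT_ON_LINUX_HOST`, `TARGET_IS_ANDROID`). It is not taken from SPM's own manifest; it only illustrates which check applies to which platform.

```swift
// swift-tools-version:5.7
import PackageDescription

// Target-side condition: this define is applied only when the platform being
// built FOR is Android, regardless of where SwiftPM itself is running.
var settings: [SwiftSetting] = [
    .define("TARGET_IS_ANDROID", .when(platforms: [.android])),
]

// Host-side condition: `#if os(...)` is resolved when the manifest itself is
// compiled, so it reflects the machine running SwiftPM, not the target.
#if os(Linux)
settings.append(.define("BUILT_ON_LINUX_HOST"))
#endif

let package = Package(
    name: "Example", // placeholder package name
    targets: [
        .target(name: "Example", swiftSettings: settings),
    ]
)
```

When cross-compiling from a Linux host to Android, both defines would be set; on a macOS host building for Android, only `TARGET_IS_ANDROID` would be.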

I don't know specifically about the conditional package dependencies feature, but SPM not distinguishing between host and target is not the reason.

My pull adds no package dependencies and that Swift package is particular about not having any dependencies already, so that is not the reason.

I have not found this to be the case for Android so far, because it is so similar to linux, but maybe you've tried building for Android with more Swift packages than me, or perhaps you are extrapolating from your experience with more dissimilar platforms like PS4 or something.

Swift Atomics support is a known issue that can be fixed by the pull request "WiX: Add missing `\usr\lib\swift\clang`" by stevapple (apple/swift-installer-scripts #144), which was submitted 6 months ago with effectively no response.

This is the story I’d like to tell about improving Windows support so far. Low responsiveness is the main driver for me to switch to more productive areas other than Swift on Windows.


when(platforms:) requires tools version 5.7+. in aggregate i am seeing many package authors hike toolchain requirements to 5.7 recently, but i believe this is due more to fallout from SE-0346 than to a sudden disinterest in toolchain backcompat.

in theory it is possible to vend multiple manifests targeting different toolchain versions, but there is still no way for multiple package manifests to share definitions (as the manifest cannot have dependencies of its own.) so effective tools version is always much lower than the last minor toolchain release. (it looks like Embassy's effective tools version is 4.2.)

i personally am already hiking requirements across the board and would probably use when(platforms:) to gate Android-incompatible dependencies. but i have never really bothered to support more than 3 minor versions going backwards in the first place, and 4.2 to 5.7 is a much larger gap.
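for reference, a minimal sketch of what gating an Android-incompatible dependency with when(platforms:) could look like. the package `Example`, the dependency URL, and the product name `SomeNetworking` are all hypothetical placeholders, and the tools version simply follows the 5.7 figure mentioned above.

```swift
// swift-tools-version:5.7
import PackageDescription

let package = Package(
    name: "Example", // placeholder package name
    dependencies: [
        // Hypothetical dependency that does not support Android.
        .package(url: "https://example.com/some-networking.git", from: "1.0.0"),
    ],
    targets: [
        .target(
            name: "Example",
            dependencies: [
                // The condition lists the platforms where the product IS used;
                // Android is excluded simply by leaving it out of the list.
                .product(
                    name: "SomeNetworking",
                    package: "some-networking",
                    condition: .when(platforms: [.macOS, .iOS, .linux, .windows])
                ),
            ]
        ),
    ]
)
```

the source code would still need a matching `#if canImport(SomeNetworking)` (or similar) around any use of the module, since the import will not be available when building for Android.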

i don't see any indication that the package is particular about not having dependencies, only that it does not have any right now. it's perfectly possible that it would need to add one in the future, and that doing so could imply rolling back Android support.

Understandable frustration. On the other hand, I see quite a lot of pull requests for Windows being approved recently, and generally I regard it as a good sign that there seem to be various people being able to submit those pull requests for Windows. And even the Apple folks now seem to be a bit committed to the Windows port (see the comments above) – besides the members of the Apple Swift team being of course nice and helpful people, maybe this reflects the view that using a programming language on the Apple platform that is "accepted" as a "generally available" programming language by the broad programming community is in Apple's interest? Even in the GUI space, besides SwiftUI you have people working on GTK+ bindings and The Browser Company will hopefully publish something for WinUI 3 (let's see how similar those formulations will be).

I think this is an important year for Swift with all those recent changes like improvements for generics and also with the upcoming new Foundation implementation, and (without playing down how good Swift was already) I think Swift 6 will be my personal "Swift 1.0" (maybe you understand what I mean). So maybe... stay tuned.


I don't think that this is relevant, from the same thread ...

The reason builds fails is either CMake or another tool invoked by CMake writes to stderr. PowerShell triggers terminating exception then.

This is in reference to their script. Adding in their suggestion of:

$ErrorActionPreference = "Continue"

is certainly possible, but I don't think that will help. The problem is not that the action terminates but rather that the frontend (or driver?) is silently terminating.