Hello fellow developers,
To make it easier to collaborate on adding platforms, it would be useful to formalise some expectations around platform development so that the work on all the other goals can continue effectively. I would like to request comments on the following attempt to create such a document. This effectively continues the previous thread about how we enable platform proliferation for Swift and encourage contributors to work on different ports.
I hope that we can get a productive discussion from this and create a strong basis for continued efforts for new targets.
Saleem
Proposal: Policies for Swift Platform Development
Core Principles
In order to create a stable ecosystem for Swift, it is important that we maintain a single coherent ecosystem across platforms. Whenever technically feasible, the project should aim to provide the same interfaces, behaviours, and capabilities on every platform. For example, the compiler and build system should support both static and dynamic linking of libraries on all platforms.
This not only makes it easier for users of the language to target different environments, but also helps the language implementers by reducing the complexity of the implementation through fewer divergent code paths. By sharing the code paths, the implementation can be vetted more thoroughly by multiple toolchains (e.g. MSVC, gcc, clang). This gives a higher reliability in the Swift compiler itself and helps catch issues before users hit them. It also enables additional avenues of approach for isolating and debugging defects when they are found.
We should also exploit temporal locality in our approach to system development. It is easier to resolve issues with the context fresh in mind rather than trying to recover the details after time has passed. As such, it is important to ensure that we carefully time bound regressions in order to address them quickly and efficiently.
Because future features often build upon existing functionality, removal of functionality from a specific target endangers future functionality on that platform. When not actively maintained, code quickly bitrots, and will result in platforms rapidly regressing.
Enabling Collaboration and Incubation of Features
Software engineering is a collaborative process. As such, it is often required that we have features that are not yet fully ready to be merged in order to enable people to collaborate. This requires that we have a maturation process for features to allow developers to quickly make progress.
One approach to enable this is to make features under incubation an explicit, unsupported opt-in: there is no driver level control, the feature can only be accessed through the frontend, and it is marked as experimental. That is, features can be gated as -Xfrontend -enable-experimental-feature, which restricts and explicitly marks the feature as incomplete. When the feature is formally introduced with driver level options and control, it would be enabled on the major platforms (given that the feature is not inherently platform specific).
Features graduating from experimental to production would be evaluated across the major platform environments (i.e. macOS, Linux, Windows, Android). The new feature work should be enabled across these platforms, with the possibility of exceptions being granted in certain circumstances (e.g. the feature does not make sense on a particular platform).
Platform Support
In order to support multiple platforms, we need platform owners who are the points of contact for developers to reach out to when platform specific issues arise. Although many of the day-to-day issues are easily resolved by builds and tests, platform specific issues can still come up. Having an explicitly identified platform owners group to help resolve these issues will ensure that the different platforms help keep the overall project healthy.
Additionally, a tracking mechanism that identifies each feature and each individual test that is disabled for any reason, with a time bound for fixing it, will help ensure that the platforms remain in a healthy state. Feature authors should be responsible for working with the platform owners to ensure that features are enabled on other platforms before the next release.
As platform support improves, it would be possible to expand the major platform list. In order to be considered a major platform, the platform should have a feature set comparable to the other platforms. Under certain circumstances, it may be possible to have exceptions granted to the platform after discussion with the core team and engineering operations team. Failure to maintain the platform’s feature parity may result in the platform being removed from consideration as a major platform with input from the engineering operations team.
We should create a table of functionalities that are deemed part of the platform compatibility. This would begin as a forward facing document, as enumerating the existing features is deemed too onerous. This document should reside in the Swift repository. Changes by the community to backfill past features are acceptable; however, no guarantees would be made about the completeness of the document for features prior to Swift 6. Issues tracking the implementation of the feature could be linked to JIRA if appropriate.
Feature implementers should be encouraged to reach out to the platform port owners so that problem areas of integration are addressed early in the design phase, reducing the chance that a feature introduced without consideration for other platforms causes problems for the ports. This would help ensure that platforms do not regress on functionality as new features are introduced into the project.
Build Improvements
In order to ease the builds with different C++ toolchains, we should enable any additional diagnostics which increase the likelihood of clang catching a compile issue that other compilers may see. This includes things like enabling -Werror=gnu and using -std=c++14 rather than -std=gnu++14. Whenever possible, we should enhance the diagnostics in clang to identify the issues that other compilers identify (assuming that they are not actual issues in the other compiler).
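As a small illustration of the kind of construct such flags surface, the sketch below uses the GNU conditional with an omitted middle operand. clang and gcc accept it by default, but it is not standard C++, so other compilers such as MSVC may reject it. The function name valueOrDefault is hypothetical and exists only for this example.

```cpp
// Hypothetical helper, used only to illustrate a GNU extension.
// Returns `value` when it is non-zero, otherwise `fallback`.
constexpr int valueOrDefault(int value, int fallback) {
  // GNU shorthand (an extension that clang reports under -Wgnu):
  //   return value ?: fallback;
  // Standard C++ spelling, accepted by every conforming toolchain:
  return value ? value : fallback;
}
```

With the commented-out line restored, a clang build using -Werror=gnu would be expected to fail locally instead of the problem first appearing on another toolchain.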
In order to make it easier for collaboration in maintaining ports, we should introduce a new document to centralize documentation on how to address common pitfalls when targeting all major platforms. This document would be built up incrementally. Since a large source of these issues tends to be undefined behaviour in the (C++) language, it would be best to enable a UBSAN build of the compiler.
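One example of the kind of pitfall such a document might record, offered as a sketch rather than as a known issue in the Swift sources: shifting a 32-bit value by 32 or more is undefined behaviour in C++, and the observed result can differ between targets. A UBSAN build (clang's -fsanitize=undefined) reports such shifts at run time. The helper name lowBitMask is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: returns a mask with the low `bits` bits set.
constexpr std::uint32_t lowBitMask(unsigned bits) {
  // Shifting a 32-bit value by 32 or more is undefined behaviour, so the
  // full-width case is handled explicitly instead of relying on the shift.
  return bits >= 32 ? 0xFFFFFFFFu : ((std::uint32_t{1} << bits) - 1u);
}
```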
Evolving Testing
Testing software in an automated fashion is critical to ensuring that things do not regress. Oftentimes when tests fail on a platform they are indicating that something is not being handled completely and that we are relying on specific behaviours. Simply marking the test as XFAIL largely leaves the tests disabled indefinitely. Having test expectations diverging on different platforms means that issues on the platform remain unnoticed. Filing a defect report on the issue does not guarantee that the issue will be resolved (particularly if it is deep within the original implementation and not something that the platform owners are familiar with).
One option to reduce the impact of XFAIL on a particular platform is to universally mark the test as XFAIL unless there is a fundamental platform specific behaviour that is being tested (which indicates that the test may be better marked as UNSUPPORTED). This would help ensure that platforms are mutually evolving.
It may be beneficial to apply some code coverage metrics to the tests to ensure that platforms are being tested equally. Some form of this already exists but is not currently highlighted: lit provides a testing summary, and perhaps we need to highlight the summary more effectively to indicate the quality of testing for a change. Better visibility into the characteristics of the tests being disabled should allow us to make more informed decisions.
Continuous Integration
In order to ensure that developers are able to quickly identify problems, we need to be able to quickly test changes against different platforms. The Windows support has had post-commit testing for nearly a year. Currently, Windows can be tested in CI on an opt-in basis. It would be beneficial to make this a non-blocking pre-commit test to establish both the stability and scalability of the testing infrastructure. As we gain better confidence in the ability to support development in a pre-commit form, we should consider moving it to a required pre-commit test.
Escalation Policy
We should create a collaborative environment which encourages developers to try to keep all the ports working and progressing together. However, it is important that we have a process in place to escalate issues.
If a specific port fails, the change author should attempt to resolve the issue based on their knowledge and any documentation that is available. If, after consulting the documentation and trying to address the issue with their own knowledge, they are still unable to resolve it, they could reach out to the platform owners (note that we do not want this to degrade into “the compile failed on this platform, hand it off to the platform owners”; the change author should make an effort to resolve the issue).
The platform owners should be given ample time (~24 hours excluding holidays/weekends) to respond to the issue. If the platform owners do not respond or are unable to resolve the issue, it is acceptable to mark test failures as XFAIL and file a release blocker bug to ensure that the issue will be properly addressed.
If possible, it is best to incorporate the changes for the platform into the change itself. However, if the change is too large or completely orthogonal to the change, it may become necessary to file an issue for the platform owner to resolve separately. In such a case, the offending tests would be marked as XFAIL, an issue filed for the port, and the change could continue to be merged while the port maintainers asynchronously resolve the issue.
Release Management
In order to better track tests that we disable temporarily, when tests are marked as XFAIL to allow us to merge a change, we should file a JIRA issue to track the disabled test and mark it as a release blocker so that we can ensure that the problem is addressed before the release. The release manager would be responsible for ensuring that the issues are scrubbed before a release, and for identifying why a disabled test has not been enabled again before a release if that should occur.
The release manager would periodically review the state of the ports (identifying tests which have been disabled either because they could not be solved in time or the port maintainers were not available), and ensure that they are brought to the attention of the port maintainers. This would allow them to make recommendations on priorities. Trends should be identified and used to make suggestions for improvements to policies.
Compiler specific workarounds should be placed under a macro that identifies the workaround. This would allow us to find and remove the workarounds quickly when the compiler requirements are bumped.
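A minimal sketch of what such a guard might look like follows. The macro name EXAMPLE_WORKAROUND_OLD_MSVC, the version threshold, and the SourceRange type are all hypothetical and chosen only for illustration; they do not describe an actual workaround in the Swift sources.

```cpp
// Hypothetical macro name and version threshold, chosen only for illustration.
// One named switch per workaround keeps every affected site easy to find.
#if defined(_MSC_VER) && !defined(__clang__) && _MSC_VER < 1920
#define EXAMPLE_WORKAROUND_OLD_MSVC 1
#else
#define EXAMPLE_WORKAROUND_OLD_MSVC 0
#endif

struct SourceRange {
  unsigned begin;
  unsigned end;

#if EXAMPLE_WORKAROUND_OLD_MSVC
  // Workaround (illustrative): give the older compiler an explicit
  // constructor. Delete this block, and the macro, once the minimum
  // compiler version is raised.
  constexpr SourceRange(unsigned b, unsigned e) : begin(b), end(e) {}
#endif
};

// Callers are written once and behave the same with or without the workaround.
constexpr unsigned rangeLength(SourceRange r) { return r.end - r.begin; }
```

Searching for the macro name then lists every site that can be cleaned up when the compiler requirement changes.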
Footnotes
- CI runs indicate that Linux has ~1389 unsupported tests, Windows has ~1397 unsupported tests, macOS has ~190 unsupported tests. Experiences with the Windows port show that some of these tests are inappropriately marked as unsupported and a set of them were enabled during the Windows port. It could be useful to periodically evaluate the current state of the disabled/unsupported tests.
- The test failures, at least for Windows, tend to be cases where compiler differences require a small change (e.g. a move assignment operator is not synthesized without an explicit request) or path separator issues where the compiler is simply not normalizing the path as it should.
- This document is written with the current platforms of interest being macOS, iOS, Linux, Windows, and Android as major platforms, but should keep things open for other platforms.
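To make the move assignment footnote concrete, here is a small sketch of one standard C++ rule under which a move assignment operator is not generated unless it is requested. Whether this matches the exact failures seen on Windows is an assumption; the ScratchBuffer type is hypothetical.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical type, used only to illustrate the rule.
class ScratchBuffer {
  std::unique_ptr<int[]> storage;

public:
  ScratchBuffer() = default;
  ScratchBuffer(ScratchBuffer &&) = default;
  // Declaring the move constructor above means the move assignment operator
  // is no longer implicitly declared, so it is requested explicitly here.
  ScratchBuffer &operator=(ScratchBuffer &&) = default;
};

// Compile-time check that the explicit request took effect.
static_assert(std::is_move_assignable<ScratchBuffer>::value,
              "ScratchBuffer should be move assignable");
```

Spelling out the defaulted operator costs one line and removes any dependence on how a particular toolchain handles the implicit case.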