Forward Declarations in Swift?

Hi all -- fairly new to posting here so please let me know if additional context / etc. of any sort is needed.

I work on a very large codebase (2M+ lines of code; 50/50 Swift and Obj-C; 150+ modules), and at that scale we have to be very careful with how we shape the dependency structure of modules within the codebase to optimize incremental build times.

One such strategy is the tried-and-true API / Implementation split; we have modules that are strictly "API" modules and contain only protocols and the like. The idea is that the APIs are more stable and change less frequently, whereas the implementation generally changes more frequently. If only the implementation changes, very little code has to be rebuilt other than that implementation because other dependent modules depend on only the corresponding API module.
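To make the split concrete, here is a minimal single-file sketch. In a real project each marked section would live in its own module; the names (`ImageLoading`, `DiskImageLoader`, `Gallery`) are hypothetical and only there for illustration.

```swift
// --- API module (hypothetical "ImageLoadingAPI"): protocols only, changes rarely ---
public protocol ImageLoading {
    func loadImage(named name: String) -> [UInt8]
}

// --- Implementation module (hypothetical "ImageLoadingImpl"): changes often ---
public struct DiskImageLoader: ImageLoading {
    public init() {}
    public func loadImage(named name: String) -> [UInt8] {
        // Stand-in for real disk I/O: just return the name's UTF-8 bytes.
        return Array(name.utf8)
    }
}

// --- Client module: written against the protocol, never the concrete type ---
public struct Gallery {
    private let loader: ImageLoading
    public init(loader: ImageLoading) { self.loader = loader }
    public func byteCount(of name: String) -> Int {
        return loader.loadImage(named: name).count
    }
}

// --- App target: the only place that wires in the concrete implementation ---
let gallery = Gallery(loader: DiskImageLoader())
print(gallery.byteCount(of: "cat"))
```

Because `Gallery` only mentions `ImageLoading`, edits inside `DiskImageLoader` would not force the client module to rebuild once the sections are separate modules.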

The tricky part is that we can't make each API module fully independent. For instance, API module A might need to reference a protocol that's defined in API module B. However, if we allow API modules to depend on one another, sooner or later we'd end up with a web of interdependent modules that's equivalent to one "giant" API module. Then anytime any API module changes, all API modules need to be rebuilt, which in turn means close to the entire app has to be rebuilt (since presumably every other module in the app depends on at least one API module).
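A rough way to see the effect is to compute which modules get dirtied by a change. The sketch below uses a made-up six-module graph and a deliberately coarse model (a module is rebuilt whenever anything it imports, directly or transitively, changes); the module names and the `rebuildSet` helper are assumptions for illustration, not how the Swift build system actually tracks dependencies.

```swift
// Hypothetical graph: each module maps to the modules it imports.
// Here the API modules import one another in a chain.
let apiChain: [String: Set<String>] = [
    "APIA": ["APIB"],
    "APIB": ["APIC"],
    "APIC": [],
    "ImplX": ["APIA"],
    "ImplY": ["APIB"],
    "ImplZ": ["APIC"],
]

// Same modules, but the API modules no longer import one another.
let apiIndependent: [String: Set<String>] = [
    "APIA": [], "APIB": [], "APIC": [],
    "ImplX": ["APIA"], "ImplY": ["APIB"], "ImplZ": ["APIC"],
]

/// Returns the changed module plus every module that imports it,
/// directly or transitively.
func rebuildSet(for changed: String, in imports: [String: Set<String>]) -> Set<String> {
    var dirty: Set<String> = [changed]
    var grew = true
    // Keep sweeping until a full pass adds no new module.
    while grew {
        grew = false
        for (module, deps) in imports where !dirty.contains(module) {
            if !deps.isDisjoint(with: dirty) {
                dirty.insert(module)
                grew = true
            }
        }
    }
    return dirty
}

print(rebuildSet(for: "APIC", in: apiChain).sorted())
print(rebuildSet(for: "APIC", in: apiIndependent).sorted())
```

In the chained graph, touching `APIC` dirties all six modules; with independent API modules, only `APIC` and `ImplZ` are affected.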

The way we currently work around this is by restricting our API modules so that they can only be written in Objective-C. This is because in Objective-C we can forward declare types, which means we can avoid having API module A depend on API module B by forward declaring types from API module B in API module A.

So, for example, if I am working in an implementation module X and trying to use an API from API module A that forward declares a type from API module B, I would need to import both API module A and API module B.

This strategy allows good modularization and sane incremental build times. The unfortunate downside is that we can't leverage any Swift-only APIs, beginning with enums with associated values and running up to new frameworks such as Combine / SwiftUI, since everything has to be expressible in Objective-C.
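As a small illustration of what gets lost, here is the kind of type an Objective-C-only API module can't vend. `FetchState` and `describe` are made-up names; the point is only that an enum carrying payloads has no Objective-C representation, since `@objc` enums must be integer-backed.

```swift
// Hypothetical API-level type: cases carry payloads, so it cannot be marked @objc.
enum FetchState {
    case idle
    case loading(progress: Double)
    case failed(message: String)
}

// Exhaustive switch over the cases, unpacking each payload.
func describe(_ state: FetchState) -> String {
    switch state {
    case .idle:
        return "idle"
    case .loading(let progress):
        return "loading \(Int(progress * 100))%"
    case .failed(let message):
        return "failed: \(message)"
    }
}

print(describe(.loading(progress: 0.5)))
```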

The ask, then, is: what are your thoughts on adding the ability to use forward declarations within Swift, to enable an API / Implementation strategy that scales incremental builds for very large codebases? Forward declarations don't necessarily have to be the solution, though they are the only one I can think of right now. If there are other solutions or potential evolutions for Swift to solve the same problem, feel free to share!

Cross-import overlays would allow module A to vend certain APIs only when module B is also imported:

However, if the implementation of modules A and B are both interdependent, that may not be sufficient. It is worth asking in that case whether they’re really separate modules then.

Have you looked into using the library stability features so that clients don’t have to be rebuilt with new versions of each library?

The thing with forward declarations in this scenario is that it's a compiler trick to break a dependency cycle, but conceptually the dependency cycle still exists.

As @xwu and @ebg have noted, the fact that the current division of modules requires forward declarations indicates that the modules aren't actually separate. Having worked on similarly huge code bases, I can say that untangling things into really distinct modules, especially while they're all under active development, is a hard job that takes a long time.

That said, I'd also suggest asking the higher-level question: is optimizing build time worth the cost of a single developer doing nothing but rearranging the code for a couple of months? That was the case that had to be made within our organization at the time, and when management bought in, the job got done.

if the implementation of modules A and B are both interdependent

This probably depends on how we define "interdependent". Implementation modules A and B should never depend on one another in my case, but it is entirely possible that Implementation module A depends on API modules A and B, and Implementation module B also depends on API modules A and B. Realistically, this is probably more likely indirectly with a chain of API modules. We do still need the separation of API and Implementation.

Have you looked into using the library stability features so that clients don’t have to be rebuilt with new versions of each library?

If I understand correctly, this would require combining the API and implementation into the same module, and essentially, if only the implementation changes, we say that the change is "resilient". This would mean that these modules can depend on one another, which starts to get really hairy. If any change happens that is "non-resilient", the bubbling incremental build would be massive, and the web of dependencies would be very hard to control, so I don't think this is an option. The same problem exists where if public APIs depend on other public APIs, we end up with the equivalent of one giant module, and incremental builds become unreasonable for any non-resilient changes.

the fact that the current division of modules requires forward declarations indicates that the modules aren't actually separate

While I agree from a conceptual point of view, in practice that doesn't apply at this scale. In a codebase with almost 200 devs committing code daily, it's not practical to have a single bottleneck API layer that would cause the entire app to be rebuilt. Copy-pasting common protocols or classes is also not a maintainable approach. The API has to be split up, and we have to use tricks like forward declarations to make incremental builds reasonable. While some progress could be made in untangling some dependencies, the nature of the codebase probably doesn't allow for a full untangling.

In short, the APIs have to be split, are semantically interrelated, but need to not depend on one another for the sake of build performance.