10 years on, what would you change about Swift?

gregtitus · May 10, 2024, 4:38pm

I don't disagree with your characterization of the problems with this approach, but the programming model IS SO GOOD. Literally, I think the generics are the best part of the language. I wouldn't change a thing about the design here, except for possibly tiny things like @inline default behavior.

Joe_Groff · May 10, 2024, 5:02pm

Lest another aspiring language designer look at Swift and see no hope in the separately-compiled generics model, there are a lot of opportunities for major improvement within Swift's implementation, and a from-scratch implementation could do a lot better informed by some of the tradeoffs we made in our implementation. The essential overhead of an unspecialized generic is not much different from a class with virtual methods—you pass a pointer to the value, along with a vtable containing function pointers to all of the methods for the protocols the generic value is required to implement. Swift introduces overhead on top of that basic model for a number of reasons, including:

We require globally-unique metadata records for every type and protocol conformance, out of a combination of needing to interoperate with Objective-C's similar model for class objects and our own desire to support richer reflection for all Swift types. Since these records need to be globally unique, they require coordination through the runtime to instantiate, which can be expensive, and they need to contain every bit of information a Swift program could conceivably ever ask about the type. In the common case, though, if you're just invoking other protocol methods on the value, you don't need most of that metadata, so defaulting to a less-reflectable model for generics might've been a better choice. The need for globally unique metadata also complicates our ability to pre-instantiate metadata records even when we know statically what generic types will be used, since the runtime needs to be aware of the pre-instantiations in order to register them. It would be worth experimenting whether the overall system memory usage cost of non-unique metadata records (and added overhead of metatype equality and other operations) is worth the savings of not having to have unique records.
We never specialize protocol witness tables for generic types, so as soon as you hit unspecialized generics at any level, you pay for unspecialization at every level—so even if we know you passed a concrete Set<Array<(Int, Int)>> as a Sequence, through the abstract Sequence interface, we operate on a unspecialized Set<T: Hashable>, which in turn deals with an abstract Array<T> through the Hashable abstract interface, which in turn forwards to the abstract (repeat each T) implementation of Hashable. This ties in somewhat with the uniqueness requirement above—it would be better to instantiate witness tables for the specialized instance at point of use.
(Until recently) every Swift type is implicitly copyable, movable, and destroyable, and the compiler implicitly uses these operations. Our initial ARC optimization approach was informed by the ObjC ARC optimizer, but it should've arguably been ownership-based from the start, and although we've since switched to "OSSA" SIL for most types, we can't be aggressive as we'd like to be in some cases because of existing code relying on implicitly-extended lifetimes, and unspecialized generics also still don't get to benefit from OSSA at all. Having a better optimizer, and maybe better user control over where implicit copies are allowed to occur, would help with that overhead.
Representing the core copy/move/destroy operations as open-coded functions with a "value witness table" to dispatch them is also a major code size cost paid for every type, for code that is somewhat fundamentally going to be pretty slow. We've been working on an alternative "value witness bytecode" which represents the type layout abstractly as a string, which is interpreted by the runtime to know where to retain/release pointers and do other copy/move/destroy work; not only is this much smaller, but it's also actually faster in a lot of cases in our experiments.

We still have room within the existing ABI to eventually realize a lot of these gains, but it does take longer having to retrofit them within the existing system. I don't think allowing implicitly unspecialized generics on ABI boundaries was necessarily the wrong default, but there are definitely a lot of things about the implementation we can do better. We should also generally have a more robust cross-module optimization model for source-based projects that don't care about ABI.

Alejandro · May 10, 2024, 5:03pm

Fundamentally, in order for Swift to provide ABI stable generic interfaces, some form of "non-inlined" (or as we call opaque generics) had to have existed; otherwise ABI stable libraries could not ship public generic API without also making its implementation visible. This is the same problem that C++ has. C++ cannot ship opaque generic APIs because their generic compilation model doesn't allow for it.

// MyLibrary.h
template<typename T>
T add(T x, T y);

A client of my ABI stable library MyLibrary including this header could not call add because 1. the compiler does not emit a generic definition of add and 2. the client using add with their own ClientInt64 would not know how to compile the function because the implementation is not available for them.

In order for C++ to have these sorts of generic APIs, their implementations have to be visible always (header only).

// MyLibrary.h
template<typename T>
T add(T x, T y) {
  // OK, now clients know how to specialize
  // this for their own custom types.
  return x + y;
}

Swift solved this by fixing number 1, defining modules have a generic entry point that clients can call into if specialization isn't possible. This lets us have ABI stable generic APIs.

I think the current default is really good for ABI stable libraries (those who have -enable-library-evolution) because you should have to opt into making your implementation visible. There are a lot of consequences for these stable libraries by having implementations visible, like everything within this implementation is now ABI stable as well etc etc.

What I don't understand is (and where I think a lot of frustration comes from Swift folks) the Swift package compilation model. I'll ignore binary packages for now, but typically packages have access to their dependencies' sources. Model boundaries don't really matter here to me here besides requiring an import. In essence, every function's implementation is visible to the client because the package manager likes to statically link everything in a single binary. A generic function who isn't visible to other modules doesn't make a whole lot of sense in this model because it will be compiled right next to the clients code. I believe cross module optimization was supposed to resolve a lot of these issues, but it seems folks in the package world still need to annotate everything with @inlinable. There is still concern about code size because if the compiler saw the implementation of every dependency module then it could go around specializing everything which could be a real concern for some folks.

Gabardone · May 10, 2024, 5:30pm

I have no major issues with the language, it's what pays my bills. But a few things rub me the wrong way so I'm going to list them here:

Functions and blocks aren't first-class citizens of the type system.
Scoping is inconsistent depending on the specific construct you're using @autoclosure is cute but leads to confusion and unexpected behaviors.
Namespacing is half-assed. Can't call the default implementation of a thing, either it exists or it doesn't. Can't pinpoint an extension implementation
Related to the above: we should revisit access controls now that we got a few years of larger scale development under our belts.
Collection operations should be lazy by default whenever possible.
Having two distinct generic systems that don't always (often) play well with each other was… a choice. It's gotten better lately but there's still quite a ways to go and you still hit weird type system edges more often that we'd like.
I'm still doubtful of whether the concurrency approach taken was the better one for what we had.
Lift a page from Kotlin and let parameters be passed to things in any order.

Nobody1707 · May 10, 2024, 5:38pm

They don't. They either don't change ABI, or they perform the C++11 String redesign, and then are so traumatized that they don't change ABI again. Or they do like Rust and just not have a standard ABI other than C's.

Nobody1707 · May 10, 2024, 5:43pm

There's an obvious two-part solution that no one has had the time or resources to implement:

Add a standard alias to the current module, so it doesn't matter if that name collides with anything or if you even know what it is.
Add actual namespaces and a namespace resolution operator (::) so that you can group these things without needing the enum hack, and you can distinguish between types and namespaces.

wadetregaskis · May 10, 2024, 5:52pm

It's more or less a solved problem in many other languages, with things like from xyz import abc. Even C++'s using xyz gets you a long way. I realise those approaches have some less than ideal aspects, but they're better than nothing.

Swift has little pieces of some of these things, but they're largely unknown amongst practitioners. Even I can't call up an example of one off-hand, even though I know they exist.

Name prefixes (e.g. "NS") were one of many conventions in Objective-C that a lot of people found offensive, which was always utterly inexplicable to me (alongside e.g. pathological aversions to square brackets, yet apparently parentheses are just fine?!). I don't think they should have been abandoned out of what seemed like dogmatism when there wasn't actually a better mechanism in place.

(emphasis added by me)

Choosing to freeze the ABI forever (on Apple platforms, but therefore essentially all platforms because of feature parity expectations) was a bold choice - made with good intentions and with some benefits, for sure, but with also some really big downsides. Slowing down Swift's evolution, as @Joe_Groff alludes to, is not even the worst problem. It's only been a few years but I've already lost track of how many fixes and improvements have been unilaterally rejected because they'd be ABI-incompatible. How's Swift going to fare in another ten years, with this restraint?

I'm not sure it was the wrong decision, is the funny thing. It's just… maybe it was a false dichotomy - maybe there's a third option that gives you more of the benefits and fewer downsides.

That's exactly the false prerequisite; specifically, assuming it's always a prerequisite. Not everything ships as a dynamic library. And most stuff that does doesn't need top performance, because it's very high level API anyway where every function call is intrinsically expensive (even "low-level" examples like URLSession and half the stuff in Foundation). So it's fine for them to have unspecialised generics, but I wish it weren't the only [easy] option.

One problem with this choice is that it precludes the [easy] creation of efficient (and therefore fast) basic libraries, like generic collections & algorithms. Which would almost always be shipped as source anyway. For those, maybe by design they should only support specialisation, because it just doesn't make sense to go to the trouble of using e.g. a deque from a 3rd party package, instead of the built-in Array, unless you actually care a bit about performance. So having to pay heavy runtime costs for unspecialised generics (and lack of inlining more broadly) is a terrible trade-off to have to make. It makes many types of generic collections and algorithms untenable.

More simply: I really hate having to literally copy-paste code out of a Swift package into my own module, in order to unlock generics specialisation and achieve reasonable performance. It's ugly on so many levels.

On the upside, I think the notion of a "library evolution mode" as both a distinct mode and an opt-in was rather brilliant. If anything, I think it didn't quite go far enough - or perhaps rather, it needs some complimentary siblings, like "dynamic library mode" (a.k.a. "stable ABI mode"?) as an opt-in which does things like permit emission of unspecialised generics.

Yes, exactly. A lot of these problems would be pragmatically moot if Swift just compiled all sources as one holistic unit, rather than independent modules. Like it compiles modules as one unit, rather than individual files - which was a fantastic improvement over its ancestors. Swift just wasn't ambitious enough, in this context.

Opaque dependencies - e.g. binary dependencies, dynamic libraries, etc - would obviously not partake in this system, but that leaves generics performance as a problem only for a tiny subset of use-cases and users.

I'm heartened, though, by the observation that most of what we're talking about here can still be achieved. e.g. maybe Swift 7 can have a "whole program optimisation" mode, building on the existing "whole module optimisation". Maybe even within the existing constraints the compiler can get materially better, as @Joe_Groff suggests. Here's hoping.

Joe_Groff · May 10, 2024, 5:58pm

Like many things, there's a balance here that ultimately needs to be struck. As you pull more things into the compilation unit, the set of optimization opportunities (both those seen by the optimizer in fact, and those expected by the programmer, IME) tends to grow superlinearly, and furthermore, you lose the ability to rebuild parts of the program independently without rebuilding the entire thing. Rust for instance does build everything in one unit, by necessity, and build performance becomes an increasing problem as projects grow in size. This of course is also a problem for Swift, but the fact you can carve out separate modules to be built separately is an important escape valve to allow projects to scale.

ole · May 10, 2024, 6:00pm

For those interested, @drexin describes the "value witness bytecode" approach in this video: 2023 LLVM Dev Meeting – Compact Value Witnesses in Swift.

taylorswift · May 10, 2024, 6:04pm

in my mind, a big mistake Swift made was spurning using imports. sure, they were something of a C++ ism (although a similar concept exists in other languages like Python), but they had the benefit that people who disliked typing the namespace could simply code locally as if the namespace didn’t exist.

this criticism strikes me as strange, because in my opinion, ABI stability on macOS is the “only” reason why Swift is still relevant ten years after it was created. big corporations like Apple invent and abandon programming languages all the time, without ABI stability on a major platform, this would have been the most likely outcome for the Swift language. in a technical sense, ABI stability might have been costly, but in a strategic sense, it was an investment that Apple made that signaled a long term commitment to the language, and that’s not something to take for granted.

John_McCall · May 10, 2024, 6:04pm

There's nothing inherent in Swift that requires manual annotations like @inlinable to enable effective optimization across source-library boundaries. It's just a limitation of the current compiler that we'd love to remove. We've been focusing on other problems instead because it's at least possible to work around that limitation with those annotations. If folks would like to work on it, they're more than welcome to contribute. I think a thread in Development would be a more appropriate place for that than a catch-all thread in Evolution, though.

allevato · May 10, 2024, 6:05pm

To add to this, Swift already "kind of" does this with C/Objective-C imports, since it by default builds up the implicit module cache. But if you're on a distributed build system where you can't move that cache around, it means that each time a Swift module is compiled, the compiler has to redo the work of parsing and analyzing the entire transitive closure of C/Objective-C headers. Once we switched to explicit C modules, that problem went away and we saw build speedups on the order of 50-90% in many cases.

So it's really hard to imagine how a "build all Swift code as a single unit" model would work well at a scale beyond toy projects.

wadetregaskis · May 10, 2024, 6:06pm

For sure. But the problem is conflating "packages" in a source-control sense (e.g. this code lives in this repo, that code in that repo) with compilation boundaries. There are plenty of places, in most non-trivial programs, where you can partition the compilation without any meaningful performance loss, but they correlate poorly with source control boundaries.

The problem is the ones where you can't - e.g. basic generic collections & algorithms, most things if called frequently enough, etc.

This is in a way just another manifestation of Conway's Law. Like most manifestations of that law, it's not a good thing.

I don't know what the solution is, though. C++ kinda uses headers vs 'source' files to distinguish these boundaries, which kinda works but has its own challenges. But maybe the solution is effectively that simple - e.g. in your package manifest, designate zero or more files as "integrated" which means they're included in the module that uses them as if they'd been actually written into it (from a compilation perspective only; access controls & scoping would remain unaffected).

taylorswift · May 10, 2024, 6:09pm

there really should be a way to have an “optimization unit” that is smaller than the whole project but larger than an individual Swift module. in my projects i often feel that the natural “optimization unit” should span several (5–10?) modules and i’m unnecessarily making things @inlinable to the entire package simply because a single module is too small of an optimization domain.

vns · May 10, 2024, 6:26pm

Comparing Swift to a lot of modern languages that gain popularity, I would say that from the convenience, practicality, and simply aesthetics Swift probably the best. It is distinguishable how much were put into the language design. So it is a pure pleasure to write in it, except one major but — tooling and ecosystem.

Take for example server side Swift — deployment on any major server platform is painful. Just installation Swift on Linux machine might prepare a lot of surprises. When other languages (haven’t tried Rust yet to deploy) offer fast to start and go solutions. And this is a huge benefit to them. Even if you have significantly larger experience in Swift, it is faster to go with almost any other option.

And that spreads beyond. Leave Xcode and you have issues with autocomplete and highlight (haven’t managed to solve for neovim so far). There is also still a bias towards Swift being just Apple platforms language.

It would be delight to write in Swift majority of code for me, I have tried — and apart from Apple SDKs it is too high price for me compared to other options, there are simply not enough time to cover existing gaps or trying figure out how to make it work in certain circumstances. I haven’t experienced major performance issues, yet haven’t been working on something that has required significant performance optimizations. Obviously, later additions for structs simplified a lot of optimization points, and this growth of the language — despite increased complexity, which I believe is inevitable — is great. But so far the lack of tooling and infrastructure comparable to other languages in my perspective is the more huge downside than some not perfect features — developers much faster would get used to that if rest of the issues has been covered.

Slava_Pestov · May 10, 2024, 6:42pm

The tradeoff there is that without source and binary stability, many fixes and improvements that did happen would not even have been identified in the first place. One cannot continue to tweak the fundamentals of a programming language forever, because then the only users who remain will be the tweakers themselves.

Nevin · May 10, 2024, 6:47pm

There is a third option, which was brought up by Chris Lattner and discussed in this thread. (As per the grand tradition of these forums, the thread is about a completely different topic, and the relevant discussion is intermingled haphazardly.)

Essentially, instead of making the whole standard library ABI stable all at once, he suggested taking an incremental approach. There could have been both an ABI-unstable “baked into each app” standard library like there used to be, and also a new ABI-stable dynamically-linked standard library distributed with the OS like we have now.

At first everything would live in the unstable library as it had been. Then, slowly, over the course of time, when each individual piece of the standard library became fully optimized and finalized, it could be moved into the stable library.

Joe_Groff · May 10, 2024, 8:07pm

Even if a type isn't formally ABI stable, it becomes more difficult to evolve the wider its adoption becomes, which is a problem we see within the package ecosystem even without ABI stability. If two dylibs have different unstable notions of what a fundamental type like Dictionary looks like, then those two dylibs also can't be intermingled together.

adamkuipers · May 10, 2024, 8:32pm

I think overloading was a mistake, though perhaps unavoidable. With argument labels, generics, and default parameter values many of its use cases go away. Many of the ways they’re used are for APIs with the same spelling but different semantics and may be improved if they weren’t homonyms.

long compilation times is frequently blamed on overloading, Xcode jump to symbol gets confused half the time, and it’s hard to look up documentation of a function in a pull request since it could be a custom overload.

Operators relies on overloading, but maybe something like Rust’s model would’ve been better. As was the case for Scala, as the community matured, custom operators became less popular due to their shortfalls.

I will say I disagree with most of the complaints in this thread, and the language designers have done a great job with Swift; so maybe I’m wrong as well

johnno1962 · May 10, 2024, 8:51pm

I guess I should have linked to Archive for “Swift regrets” // -dealloc in the OP. It's an interesting reflection on the compromises and considerations that went into designing the language though my original intention with the thread was to flush out new ideas rather than a meditation on things that are "wrong". One of the conclusions so far seems to be the LLDB team really could be better resourced.