Writing a formal semantics for Swift

Alvae · November 7, 2019, 10:08am

Hello fellow Swift enthusiasts,

TL;DR; Is there a formal definition of Swift's semantics somewhere? If not, would it be of any use to you, and in which way? If such formal semantics were to be developed, what essential aspects should it include, as opposed the features that could be abstracted away?

I am a researcher at University of Geneva, working on programming languages. In an effort to use Swift as a basis to study the practicality of new/revisited theoretical concepts, I would love to have a formal definition of Swift, or subset thereof.

Swift has a rather advanced type system and offers a great set of features. I believe this makes the language suitable to explore many ideas related to static verification (e.g. capability-based systems, behavioral types, gradual typing, etc.). But a proper assessment of the soundness of a particular addition would require a formal definition of Swift's type system and operational semantics. To the best of my knowledge, such formal definitions do not exist, as Swift's semantics is expressed in prose only, in the language manual and/or in the various proposals produced by Swift Evolution.

Purely from a researcher's point of view, this hinders the use of Swift as a target to explore theoretical concepts, as correctness can only be evaluated empirically with an actual implementation. Say for instance one was planning on exploring the addition of uniqueness and borrowing a la Rust in Swift. Determining whether or not this would result in a sound type system can't be formally evaluated.

From a user's point of view, I believe that a formal semantics could also ease the understanding of some of Swift's most advanced features. For instance, teaching the use of protocols and conditional conformance can prove challenging, as the rules under which Swift's type solver operates often have to be "discovered" in a kind of trial-and-error loop.

Provided a formal definition of Swift's semantics hasn't indeed be formally defined yet, I plan on writing one. Consequently, I would be interested to know whether such an effort could be useful to the community. Similar projects have been carried out for other programming languages. A noteworthy example is Featherweight Java (FJ) [1]. FJ is a minimal core calculus that aims to capture the essence of Java, precisely for the purpose of exploring the relationship between Java's semantics and other language features. For instance, FJ does not even have assignments, and focus essentially on Java's type system.

My goal is to develop a comparable minimal framework for Swift. Hence, I would be interested to know what you would consider to be the essence of Swift. My opinion is that it should also focus primarily on Swift's type system, with a strong emphasis on the difference between value and reference types.

[1] Igarashi, Atsushi, Benjamin C. Pierce, and Philip Wadler. "Featherweight Java: a minimal core calculus for Java and GJ." ACM Transactions on Programming Languages and Systems (TOPLAS) 23.3 (2001): 396-450.

CTMacUser · November 8, 2019, 8:13pm

I wouldn't mind something like this. A formal description, like the ISO/ANSI C and C++ standards. It could allow third-party implementations, and/or keep us (the producers of the Apple-oriented implementations) honest from accidentally baking in Apple assumptions. We don't want the formal description to include every detail of the current Apple-based implementation, to allow the flexibility to improve parts of the implementation that don't matter from an external perspective.

Alvae · November 9, 2019, 12:08am

While I totally agree with you on the benefits of such documents, my idea of a formal definition would not necessarily be as technical as what something like the C/C++ standards. My understanding is that standards are more intended to be used as references for compiler implementations, rather than reasoning about the language's semantics and possible extensions thereof.

What I suggest would be something much lighter, that offers useful abstractions to reason about Swift's type system and operational semantics in general, hence the idea of extracting the essence of Swift. This also means that it would necessarily abstract over the specific details of Apple-oriented implementations.

I also believe that such an effort could in fact be a useful step to the eventual elaboration of a fully fledged standard, describing the gory details of an actual reference implementation, precisely because it will help us establish a clear and unambiguous description of some of Swift's core features.

typesanitizer · November 9, 2019, 12:23am

Swift has a rather advanced type system and offers a great set of features. I believe this makes the language suitable to explore many ideas related to static verification (e.g. capability-based systems, behavioral types, gradual typing, etc.).

Is there a specific reason for picking Swift in particular? For example, there has already been a bunch of formalization work related to Flow, Pony, Rust, Haskell and OCaml (all of which have advanced type systems), which might help you get off the ground quicker. With Swift, you'd have to lay the groundwork yourself.

For example, is there a reason why Swift is significantly different than Scala from a formalization POV? Off the top of my head, many of the key type system features (broadly speaking) are similar -- classes with inheritance and subtyping, method overloading, traits+implicits vs protocols.

Say for instance one was planning on exploring the addition of uniqueness and borrowing a la Rust in Swift. Determining whether or not this would result in a sound type system can't be formally evaluated.

I think that having an operational semantics for SIL (Swift Intermediate Language) would be a great addition here.

It would help people writing optimization passes.
SIL doesn't change that much compared to Swift, which means there is a better chance that the operational semantics is useful for a longer time without further changes.
Since SIL is smaller than Swift, it already has better coverage of language features in its documentation. So it might be possible to have an operational semantics for a large chunk of SIL, compared to spending the same time to design an operational semantics for a small idealized subset of Swift.

Another avenue that might be worth exploring is designing a formal memory model. @lorentey has been working on atomics, so you might consider discussing this idea with him. He has also authored a new pitch here which is related.

From a user's point of view, I believe that a formal semantics could also ease the understanding of some of Swift's most advanced features. For instance, teaching the use of protocols and conditional conformance can prove challenging, as the rules under which Swift's type solver operates often have to be "discovered" in a kind of trial-and-error loop.

I have a couple of hunches here:

When teaching Swift's advanced features, the lack of formal semantics is not the core issue, or even one of the primary issues.
Most practitioners would find learning from a formal semantics more intimidating compared to learning through examples and exercises.

To be clear, I'm not trying to dissuade you from writing a formal semantics for a (subset of) Swift. All I'm saying is, in my opinion, I think there might be other avenues to explore that might be more beneficial from a compiler implementation perspective.

Alvae · November 9, 2019, 1:21am

Is there a specific reason for picking Swift in particular? For example, there has already been a bunch of formalization work related to Flow, Pony, Rust, Haskell and OCaml (all of which have advanced type systems), which might help you get off the ground quicker. With Swift, you'd have to lay the groundwork yourself.

There is indeed a great body of work in various programming languages, like those that you mentioned, and many others. My impression is that Swift's adoption outside of the academic community is slightly larger (probably with the notable exception of Haskell and Rust in your list), and addresses more industry-oriented applications. The idea of developing/adapting a verification technique to help Swift developers debug/improve their mobile apps is extremely compelling (although quite ambitious).

Swift might be an interesting subject depending on the feature being investigated. I mentioned the addition of reference uniqueness, which would be rather pointless to evaluate in languages like Rust or Pony that already have it. On the other hand, Swift is relatively permissive in its treatment of aliasing, which poses interesting challenges as to identify the minimal requirements to guarantee specific reference properties.

Swift is also one of the rare "modern" programming languages that does not primarily allocate everything onto the heap (with the notable exception of Rust). Consequently, this makes it a good candidate to evaluate features based on statically determinable lifetimes, as opposed to GCed languages based on the JVM (e.g. Scala).

And finally, though that last point is admittedly rather irrelevant, I'm quite fond of the language.

I think that having an operational semantics for SIL (Swift Intermediate Language) would be a great addition here.

It might indeed, specifically for the operational semantics. I am however not familiar enough with SIL to determine whether or not its formalization would be relevant to reason about Swift's type system. Any insights in that regards would be much appreciated.

Another avenue that might be worth exploring is designing a formal memory model. @lorentey has been working on atomics, so you might consider discussing this idea with him. He has also authored a new pitch here which is related.

Thank you for this pointer. I'll look into it.

I have a couple of hunches here:

When teaching Swift's advanced features, the lack of formal semantics is not the core issue, or even one of the primary issues.

Most practitioners would find learning from a formal semantics more intimidating compared to learning through examples and exercises.

I don't completely agree with that position. While seasoned practitioners with prior experience in other languages might get up to speed with examples and exercises rather quickly, my observation is that this process is much harder for students. Often, they lack the hindsight to understand the mechanisms beyond the scope of a single example/exercise, in particular when faced with advanced features, as they tend to learn new concepts syntactically only. While a formal semantics can definitely be scary at the first encounter, my experience in class have lead me to believe that it eventually helps students building a deeper understanding of the mechanisms, less tightly linked with the syntax.

Admittedly, my opinion is completely biased by my academic background and disturbing affection for inference rules.

To be clear, I'm not trying to dissuade you from writing a formal semantics for a (subset of) Swift. All I'm saying is, in my opinion, I think there might be other avenues to explore that might be more beneficial from a compiler implementation perspective.

The purpose of my post was precisely to spark similar discussions and gather this kind of comments. Yours are much appreciated.

jonprescott · November 10, 2019, 10:17pm

For all of these languages, there are coordination papers written proposing new features (Swift evolution pitches and proposals are an example relevant to here), reference proposed implementations made, evaluation by the community solicited, and generally, a consensus is drawn from the stake holders in the community. However, for C and C++, the stakeholder community is much, much larger that for Rust, Go, and Swift. There are multiple compiler vendors, standard library vendors, other library vendors, and robust user communities across a variety of disciplines. C and C++ evolution is more formal because it has to be. There too many competing interests to satisfy. Rust, Go, and Swift will most likely be the same way if they are still around in 40-60 years.

gnuoyd · November 11, 2019, 9:28pm

Interesting project!

Do you think you will build an executable semantics like has been done
for C using the K Framework?

More generally, what do you foresee the semantics looking like? How do
you foresee practitioners using them?

Dave

Alvae · November 11, 2019, 10:39pm

Interesting project!

Thanks!

Do you think you will build an executable semantics like has been done
for C using the K Framework?

Not necessarily, at least not in a first step. If I am not mistaken, the work that has been done with the K framework focuses primarily on operational semantics. I would prefer formalizing the type system first, as I think it would more useful to reason about language extensions.

Nonetheless this would be a very interesting as a step toward a certified compiler, if Swift were to be used one day as an implementation language for critical systems for instance.

More generally, what do you foresee the semantics looking like?

What I have in mind would be pretty close to Featherweight Java. It would be a strict subset of Swift that abstracts over a lot of features to focus on the essential components that characterize its type system and operational semantics. I invite you to have a look at the paper I cited in the original post to get a better idea of what I mean.

How do you foresee practitioners using them?

It depends what "practitioners" mean.

On the theoretical side, the contribution would help designing language extensions and develop/adapt dynamic and static analysis techniques. Assuming Swift's type system can be proved sound (modulo unsafe operations like forced-unwrapping or the use of low level pointer), one would have a tool to check that their extension preserves such a soundness result. To some extent, this could also apply to Swift Evolution proposals, in particular for those aimed at improving Swift's type system (e.g. SE-0143). It is worth mentioning that WebAssembly is integrating formalization in their feature proposals (WebAssembly W3C Process).

While this sounds extremely nice for researchers and (to a lesser extent) SE contributors, what about users and compiler developers? Well, while a formal semantics might not be directly useful to them "as is", the reasoning process necessary to create it could help shedding light on a some dark corners of Swift's semantics. I would for instance argue that the actual rules that describe how methods are dispatched in the presence of extensions are a little obscure. By the mere act of designing a formal semantics, one would be forced to explore these dark corners. The formal semantics (and explanation thereof in prose) could then be used to complete/refine the documentation or even maybe identify incorrect behaviors.

phlegmaticprogrammer · December 17, 2019, 7:49pm

Any progress? I learnt a lot more about Swift's type system recently when coding up a library/framework in Swift, and I find it has a rather subtle and likeable combination of type system features. I would love to have a distilled writeup of the type system. This could maybe also help to fix a few problems with it, like the inability of having abstract classes (and thus classes intended to only serve as abstract classes need to implement the protocols they postulate ...)

Lantua · December 17, 2019, 8:00pm

I don’t know about the rest, but this one have been discussed (and deferred) before in SE-0026. There’re even some discussions afterward.

Maybe you can look into this forum and see if you have anything new to bring to the table.

Alvae · December 17, 2019, 8:31pm

I started working on a calculus reminiscent to Featherweight Java. It is still in an early stage of development, but I have already formalized a good part of Swift's method lookup mechanism.

I'm currently writing a technical report. Hopefully, I'll be able to share a link to a preliminary version by the end of the year.

While this is certainly a long-term goal of my endeavor, I think the first contributions will be to:

have a comprehensive documentation of Swift's type system, and
identify improper behaviors with respect to that documentation.

In particular, I am very interested in identifying features/behaviors that are more a consequence of implementation choices/strategies than the product of a sound type system. I have for instance discovered or rediscovered a few bugs related to protocol inheritance.

phlegmaticprogrammer · December 17, 2019, 8:35pm

Thank you for the pointer, I just read through the discussion and they bring up my problems with missing abstract classes (and superfluous required initializers, just take the superclass one!).

But this discussion is 4 years old now, nothing much seems to have happened in that direction, and it is really annoying when designing an easy to use library / framework. I am working around it, so it is no deal-breaker, but it clearly is a feature that belongs in Swift. Having a compact description of the type system would make it glaringly obvious that this is missing, I think.

phlegmaticprogrammer · December 17, 2019, 8:37pm

That sounds great! Looking forward to the report.

Alvae · December 17, 2019, 8:49pm

While I would certainly welcome abstract classes, I disagree. Having carefully compared how Swift protocols compare with other abstract type mechanisms in other programming languages, my opinion is that use-cases for abstract classes that can't be solved with protocols and extensions thereof are actually quite scarce.

That said, in the interest of avoiding turning this thread in a passionate exchange about the necessity, benefits and downsides of abstract classes, I will be happy to continue this conversation in another dedicated thread.

phlegmaticprogrammer · December 17, 2019, 11:41pm

I jumped into this thread for purposes of abstract classes discussion