Featherweight Swift (Formal Core Language)

Hello everyone!

About a year ago, I started a conversation on swift-users about the prospect of writing a formal semantics for Swift. I promised a technical report that unfortunately never came (my apologies). Today I'm happy to announce that this idea eventually materialized in a scientific project called Featherweight Swift (FS).

As its name suggests, FS is a tiny subset of Swift that gives up on a lot of features to focus on the essence of Swift's type system. It results in a purely functional language, reminiscent of a typed lambda calculus. The main contribution of this work so far has been to provide a formal framework to reason about Swift's method lookup formalism, which has been useful not only as an explanation device, but also to discover a handful of bugs in Swift compiler's implementation. The whole thing resulted in a scientific paper that will be published at the ACM SIGPLAN International Conference on Software Language Engineering (SLE). The talk is hosted on Youtube, and the paper itself can be read here.

Near future works will be most likely aimed at bringing more features from actual Swift into the formal calculus. Obvious candidates include bounded polymorphism (a.k.a. type constraints on generic types) and side effects (i.e., assignments). As a more long-term objective, it would be very interesting to explore how FS can be used to formulate soundness properties on Swift's ownership model.

We are looking forward to any sort of feedback from the Swift community, and eager to work with anyone who would feel like contributing to this project, in any shape or form.

40 Likes

While I'm very happy to see scientific papers being written about Swift, it seems to me that by leaving out mutation, this analysis misses the most important thing distinguishing Swift's type system from that of most other languages: safe, efficient, part-wise mutation without unpredictable side-effects, otherwise known as (strict mutable) value semantics. That, IMO, is the essence of Swift's type system.

If you're interested in expanding your analysis to include value semantics, I know a few people who might be very interested in working with you. :wink:

7 Likes

Hi Dave, thanks for your message!

Leaving out assignment was a tough call, specifically (as you said) because of the traitement of value-semantics. Part of the reasons we did was to keep the paper relatively concise and centered around "classical" type notions.

As mentioned here and in the paper's conclusion, formalizing assignment semantics is definitely a goal for near future works. I would be more than excited to work on this particular aspect with anyone interested!

1 Like

This is super interesting, thanks for sharing!

By the way, it appears that the soundness bug with the forced cast to Self was fixed at some point:

protocol Boxable {
  func boxed() -> Self
}

class List: Boxable {
  let next: List?
  func boxed() -> Self {
    List(next: self) as! Self
  }

  init(next: List?) { self.next = next }
}

class DerivedList: List {
  let foo: String

  init(foo: String, next: List?) {
    self.foo = foo
    super.init(next: next)
  }
}

print(DerivedList(foo: "bar", next: nil) .boxed().foo)

Fails at runtime with:

Could not cast value of type 'l.List' (0x107f532f8) to 'l.DerivedList' (0x107f53398).
fish: './l' terminated by signal SIGABRT (Abort)
2 Likes

Yes, it has been solved since 5.2 I believe: [SR-11818] Bad downcast to Self not caught at runtime ¡ Issue #54229 ¡ apple/swift ¡ GitHub

1 Like

I think treating value semantics goes well beyond assignment. It depends—at least—on dealing with inout, the law of exclusivity, pass-by-value, and initialization.

One of the problems with “classical type notions”—if we take that to mean the formal mathematical language used to describe type systems in research papers—IIUC, is that they build reference semantics into the core representations of things like arrays, and force us to:

  • invent a notion of “copying a value”
  • inject copies into ordinary-looking code in order to express semantics according to the formal reference-semantic language

I think that's likely to appear inelegant and hacky to the language purists that read these papers, and to make a compelling argument I fear it might be important to create a formalism that represents types with value semantics more directly.

What do you man by this?
Are you referring to purely syntactic representations of operational semantics?
Or perhaps you are hinting at researchers' pesky tendency to just assume the universality of Java/Python-like reference semantics :smirk:?

It is true that these approaches are very appealing to semanticians, but I would argue the vast literature on substructural type systems, effect systems, capabilities, permissions, ownership and regions has probably a lot of insights to offer for an elegant formalization of Swift's value semantics. While this line of work is less "classical", in the sense that it is not as a trivial as call-by-value lambda calculus extended with a Java-like memory store, it is now relatively well-established, at least in the PL community.

My knowledge of Swift's precise value semantics is most certainly lacking (hence my eagerness to work with people who are most likely far more competent than me!) Thus, at this point, it is a little difficult for me to outline a clear picture of how I believe the complete formalization should/could look like. To pick an example from the features you mentioned, my intuition is that something like the law of exclusivity could probably be explained elegantly with linear/affine properties, either at the level of a term type or (I would even prefer) at the level of the typing environment.

I am currently working on a type capability system, heavily inspired by Alias Types [1] and more recently the work on Mezzo [2]. The core idea of this kind of approaches is to give up on the usual type invariance of memory locations (a.k.a. storage references to borrow from the formulation given in the ownership manifesto), so that we may encode flow-sensitive properties and/or constraints more naturally. This echoes recent developments on Rust [3, 4] that leverage ownership types [5] in concert with separation logic [6]. In the context of this sort of framework, a (very) informal definition of the law of exclusivity could be, AFAIU, to associate a (writeable) access to a storage as a linear capability that has to be temporarily consumed by an inout argument. Although it is a little outdated, there is a very nice presentation from François Pottier about these ideas [7].

There is obviously a lot of work ahead if one's wish is to formally define everything, in such a way that it results in a framework that can be useful not only to the scientific community but also Swift's users and developers. Hence, I believe that one good way to approach the problem is to adopt an iterative process, extending a minimal formal language with one single feature at the time. This is certainly what I wished to start with Featherweight Swift.


[1] Frederick Smith, David Walker, J. Gregory Morrisett: Alias Types. ESOP 2000: 366-381
[2] Thibaut Balabonski, François Pottier, Jonathan Protzenko: The Design and Formalization of Mezzo, a Permission-Based Programming Language. ACM Trans. Program. Lang. Syst. 38(4): 14:1-14:94 (2016)
[3] Sergio Benitez: Short Paper: Rusty Types for Solid Safety. PLAS@CCS 2016: 69-75
[4] Aaron Weiss, Daniel Patterson, Nicholas D. Matsakis, Amal Ahmed: Oxide: The Essence of Rust. CoRR abs/1903.00982 (2019)
[5] Dave Clarke, Johan Östlund, Ilya Sergey, Tobias Wrigstad: Ownership Types: A Survey. Aliasing in Object-Oriented Programming 2013: 15-58
[6] John C. Reynolds: Separation Logic: A Logic for Shared Mutable Data Structures. LICS 2002: 55-74
[7] François Pottier: Wandering Through Linear Types, Capabilities and Regions. (2007)

I should probably defer to my colleagues on this as I may not know of what I speak. All I can tell you is that my group at Google was working on a paper about value semantics for PLDI and that was how @shabalin described the formalisms one uses for that conference—if I'm not mistaken.

Thanks for this list of references; I think a number of them are new to us. I'll see what I can learn/understand.

I wouldn't be at all surprised to learn that a formalization of Rust's type system could serve as a basis for formalizing value semantics, by the way. However, I think it also might introduce complexity needed only to support that part of Rust that manages the lifetimes of references to ensure that they are safe.

2 Likes

Just skimming this one so far, it appears to be aimed very squarely at proving safety in reference-semantic systems. For example, pointer aliasing is not a concern when working with types having value semantics. While it's possible to implement a type that has value semantics and uses aliased pointers in its implementation (and a system like this might be useful in proving the safety of such a type), the nature of value semantics is that outside the abstraction barrier of such a type, there is never observable aliasing, and types with value-semantics can be arbitrarily composed without disturbing that property.

I've suggested a need to represent value semantics “more directly” in the formalism precisely because the complexities of aliasing disappear once you are in that world, and it should be possible to reason much more simply about the results.

This unfortunate observation echoes my smirky remark about the bias toward reference-based systems in research. Nonetheless, I still think we may reuse the core idea behind Alias Types by abstracting over the notion of references. If one think of a reference as any handle (for a lack of a better term) to some memory location (or storage), then one can use capabilities and other related techniques.

One key insight might be to think of the way one manipulates data in LLVM IR (which would be similar in SIL, I presume, but I'm more familiar with the former). Say you compile this code into IR:

var x = 42

to get something like the following:

%x = alloca i64
store i64 42, i64* %x

Now, we can think of x as a reference. It does not matter whether it has a reference semantics or a value semantics. Either way, it is the address of some memory location, so a "reference-based" reasoning applies.

In Alias Types, one could type x with ptr(p), i.e., the singleton type of some memory location p. The alloca would generate the capability {p -> junk}, which roughly reads as "I know of a location p at which there's some garbage (i.e., uninitialized) value". This capability would be consumed by the store to result in a new capability {p -> i64}. This is what the authors call a strong update.

Now let's say I add this second declaration:

var y = x

which translates to the following IR:

%y = alloca i64
%0 = load i64, i64* %x
store i64 %0, i64* %y

The load would require the capacity {p -> i64} so that %0 can be typed i64 (not a location, an actual value!) As a result, the store can now update {q -> junk} (assuming y: ptr(q)) to {q -> i64} and we have successfully duplicated x's value.

What if we had reference semantics? Well, then we would have the following (simplified) IR:

%z = alloca %SomeClassType*
%1 = call @SomeClassInitializer()
store %SomeClassType* %1, %SomeClassType** %z

Therefore we have %z: ptr(r) and %1: ptr(s), and the environment gets the capabilities {r -> ptr(s)} and {s -> SomeClassType} after the store. Now if we assign z to a variable w, we repeat all our dance for the pointer's value, and in the end we've successfully copied it, effectively duplicating the reference, without even touching {s -> SomeClassType}.

Back to the law of exclusivity, let's say I have the following function:

func f(a: inout Int, b: inout Int) {
  // ...
}

The formal type of the function would be something like ∀[a, b, {a -> i64}, {b -> i64}](ptr(a), ptr(b)) -> Void. This roughly reads as "for all locations a and b such that a and b hold int values, a function type that accepts the locations a and b and returns nothing". This means that to call f you need to provide two capabilities, namely {a -> i64} and {b -> i64} (for some locations a and b). But {p -> i64} (the capability for x's storage) occurs only once in the environment. That means, if we treat capabilities linearly, that we can't copy {p -> i64} and thus we don't satisfy f's signature with this call:

f(a: &x, b: &x)

The "temporary" part of the law of exclusivity would require that a function may give back the capabilities it consumed. This is not necessary in the language presented in the paper, because the authors chose a continuation passing style. That said, all we have to do is to extend function types to describe this information. Hence, the type of the following function:

func f(a: inout Int) {
 /// ...
}

would be ∀[a, {a -> i64}](ptr(a), ptr(b)) -> Void @ {a -> i64} (notation borrowed from the paper on Mezzo). Now all is fine if we call g(a: &x). We temporarily consume the exclusive write access to x and give it back once the function returns.


To me, this looks like the premise of a relatively convincing presentation. There also has been some work already to extend this idea to more elaborate types, such as self-referencing data structures, which is comforting. That said, I can foresee some non-trivial interactions with other features, especially if one wants to prove non-observable aliasing (a very cool objective!) Specifically, I worry that this approach might be way too simplistic to deal with dynamic dispatch. My intuition is that soundness will require some kind of instrumentation of the operational semantics as well. I also suspect that formalizing definite assignment might be very difficult and involve complex workarounds to deal with a partially initialized self.

Other features (not included in FS) should also prove quite challenging. On the top of my head, I know that exceptions are often difficult to handle in flow-sensitive type systems. For similar reasons, Swift's defer may also be difficult to formalize elegantly.

I'm not sure we're connecting. My point is that all of this low-level modeling of memory and pointers doesn't seem like it should be necessary in order to describe value semantics, and seems like it will result in a much more complicated model than necessary. Presumably you can describe a purely-functional system without introducing memory; that's just math. Then you can model mutation as the rebinding of variables to new immutable values, right?

I do think it's important to eventually be able to describe how types with value semantics can be built out of traditionally-unsafe parts like pointers and memory, but that doesn't seem necessary for the purposes of laying a foundation for value semantics. This post outlines my general thinking and might be helpful in setting the scene a bit.

Is what I'm saying here making any sense at all? It wouldn't be the first time I was completely misguided :wink:

5 Likes

It appears to be the case indeed, my apologies.

I argue that it depends on the purpose of the resulting formalization.

Say I define some sort of operational semantics, i.e., the purely functional system you mentioned. Then yes, of course, I can model "mutation" in very simple terms. For instance, many lambda-calculi express assignment as simply as this:

let x := v in e ~> e[x/v]

This is expressive enough, I assume, to describe the kind of formal definition you suggested. Nonetheless, notice that such a syntactic trick does not support "reference semantics". This, I believe, necessarily requires to model memory somehow. But this is still just math (to be fair, so are Alias Types). There also has been some interesting approaches to do this "syntactically"; see [1] and subsequent work from Elena Zucca.

Now the question is: what do I want from this?

If the goal is to formally illustrate the properties you described in your proposal, then I completely agree. We do not need the kind of models I suggested. In fact, if we look at your bullet points, the simple assignment semantics above can almost illustrate everything:

  • Each variable of type X has an independent notional value .
  • A language-level copy (e.g., let b = a ) of a variable of type X has an
    equivalent value.
  • Given a local variable a of type X , safe code cannot observe the value of a
    except via an expression that uses a .
  • Given a variable a of type X , safe code cannot alter the value of a
    except by (...)
  • Concurrent access to the values of distinct variables of type X cannot
    cause a data race.

The two last points require a little more attention, because they are predicated upon a notion of reference. This comes back to what I meant about the requirement to model memory as some sort of "store", so that two names can refer to the same thing. I like @Lantua's definition of so-called points of access (see Avoiding unbreakable reference cycle with value types and closures? - #20 by Lantua) to describe this concept. We can model such a store in the "conventional" way, by the means of an ad-hoc data structure, or in the style of [1]. In the former case, I guess it would look something like the following:

S; var x := v in e ~> S, x=v; e[x/ref(x)]
S, x=v1; ref(x) := v2 in e ~> S, x=v2; e

We obtain a very simple framework (I don't even need types!) to formally illustrate your bullet points. We may also prove them on terminating programs. 1-2 seem trivial. 3-4 require some care about the ways one can interact with the store. 5 seems trivial, although I might be missing something more subtle from the definition.

Now, because this is described entirely in terms of operational semantics, we cannot emulate the compiler's behavior with respect to the guarantees it claims. Consider the following example:

func f(a: inout Int, b: inout Int) {}
var x = 0
f(a: &x, b: &x)

The compiler catches that this program is incorrect, correctly pointing out that Inout arguments are not allowed to alias each other. If we model this only through the operational semantics, perhaps using conditions on reduction rules (this is the "care about the ways one can interact with the store" part that I mentioned), then there's no way to say that the program is incorrect without executing it. This is a problem because there's obviously no guarantee that a particular evaluation does not diverge (even Featherweight Swift has recursion through self). To model this, I think it would be appropriate to use an elaborate type system, so that we may prove your properties (and others) statically.

I hope this makes sense, and that I didn't mischaracterise your point. Also, it appears as if I am absolutely unable to write shorter posts, so I thank any reader who will have made it all the way through.

By the way, I'm open to move this discussion to a phone call, even if it means summarizing the exchange in text here afterwards.


[1] Andrea Capriccioli, Marco Servetto, Elena Zucca: An Imperative Pure Calculus. Electron. Notes Theor. Comput. Sci. 322: 87-102 (2016)

None needed; successful communication depends on all participants. It's as much my responsibility as yours.

I agree!

Sorry, I'm afraid I don't know how to read that notation. Can you point me to a guide?

Yes, memory (not necessarily raw memory, but something that allows two variables to be bound to the same mutable value) is required for reference semantics. But the useful properties of value semantics all stem from the fact that you can reason about the behavior of value semantic types without introducing the complexities associated with memory. It's similar to what happens when you create a purely functional programming language and run it on real hardware, which always has mutation: the formal description of the language becomes simpler, and reasoning about code becomes simpler. The difference is that with value semantics

  • there's a more direct lowering to efficient machine code that takes advantage of the hardware's mutation, and
  • the many computations that are most naturally expressed in terms of mutation can be expressed directly.

But this is still just math (to be fair, so are Alias Types).

Sorry, by “just math,” I meant that it doesn't require any programming-specific notions beyond those of traditional algebra, where a variable is actually just a name for one unchanging value. I didn't mean to suggest that Alias Types aren't mathematically well-founded; I'm sure that they are.

There also has been some interesting approaches to do this "syntactically"; see [1] and subsequent work from Elena Zucca.

I'll take a look.

Now the question is: what do I want from this?

Did you ever answer that question? :slight_smile:

Whoa, how did you make those checkboxes? They don't show up in the quoted markdown.

Nit: you checked the first bullet, but I think to do that, you probably need to remove the word “independent,” which I take to effectively imply the 2nd-to-last bullet, which you didn't check (my list contains some intentional redundancy).

The two last points require a little more attention, because they are predicated upon a notion of reference.

Hmm, I'm not sure I buy that. They certainly don't use the word “reference” anywhere. They are predicated on the ability to alter the value associated with a variable. While that overlaps with the capabilities provided by what we think of as references, I claim it's significantly more restrictive.

This comes back to what I meant about the requirement to model memory as some sort of "store", so that two names can refer to the same thing.

Sorry, you lost me. I don't understand how you conclude there's a need to discuss names “referring” to anything from the language in those two bullets.

I like @Lantua's definition of so-called points of access (see Avoiding unbreakable reference cycle with value types and closures?) to describe this concept.

That seems like a useful way to describe a world with both value and reference semantics.

Again, I'll need some help with that notation.

That all sounds very interesting, but I'm afraid it doesn't make sense to me yet; I look forward to the moment where my understanding catches up with what you wrote :wink:

By the way, I'm open to move this discussion to a phone call, even if it means summarizing the exchange in text here afterwards.

Thanks for the offer. For the time being, I'd like to keep it here, though. The written form gives me enough time to consider everything you say, follow references, etc. without worrying about being frustratingly slow.

1 Like

This is a poor attempt at writing a (small-step) structural operational semantics rule in ascii form. This notation describes a sort of rewriting rule that states how to reduce the expression let x := v in e into an expression e in which v is substituted for all free occurrences of x. For instance, the rule lets us conclude that let x := 2 in x + 3 reduces to 2 + 3. The use of the term free relates to scopes, so that let x := 2 in let x := 3 in x eventually reduces to 3, not 2.

A complete yet simple example of a programming language described in terms of SOS rules is given here: IMP: a simple imperative language.

I did not :face_with_hand_over_mouth:. Many PL researchers, myself included, are in a never ending quest for type soundness, following the "well typed cannot go wrong" slogan from Robin Milner.

In more concrete terms, I would like to show that if my type system says "OK, your program looks fine", then it cannot possibly crash at runtime (except for an explicitly unsafe operation, like x as! T).

Just type left and right brackets []. If you want the tick as well, then type [x].

My understanding was that the 2nd to last bullet only related to mutation, whereas the first related to freedom from observable aliasing. I might be misunderstanding something. I might be confused by the use of the term variable. In let x = 42, would you consider x to be one?

I prefer the term binding, as defined in Swift's language reference. This enables a clear distinction between let (a.k.a. constant) and var (a.k.a. variable or mutable) bindings. Under this interpretation, the first bullet holds in the absence of mutation, whereas the independant part would be threatened by mutation.

We might be in complete agreement, but simply attributing different definitions to the term "reference". Perhaps the phrasing "they are predicated upon a notion of reference aliasing" would be better suited?

The 2nd to last bullet mentions, as one of the exceptions: "passing the expression as an inout parameter." This suggests that, at some point of the execution, there must be two bindings for the same value, two points of access.

Say we have the following program:

func f(x: inout Int) {
  let b = x
}
var a = 26
f(x: &a)

There are a two observations to make:

  1. Since a is mutable, the substitution trick does no longer work. Indeed, if we just "rewrote" a as the value to which it is bound, then only the first assignment would be sound, while subsequent ones would attempt to assign a value to a value (e.g., 1 = 2). Instead, we need a way to preserve the information that a is bound to some value throughout the evaluation (or at least as long as a is in scope).
  2. The parameter x is an alias on a during the call to f, so we need a way to accurately describe this situation in the operational semantics. In other terms, we need a mechanism to specify that two names (although not usable at the same time) are indeed bound to the same value, otherwise x cannot alter the value of a .

Both observations suggest the existence of a mapping that keeps track of the current bindings in the program. Aliasing further suggests that this mapping is surjective.

Once again, those are two inference rules. But rather than describing the reduction of an expression into another expression, they describe the reduction of a configuration into another configuration. Here, a configuration is a par (S, e) where S is the mapping I discussed above, and e is an expression.

The first rule relates to the declaration of a mutable binding, and therefore modifies the mapping (or store) S so that we can keep track of the value to which the name is bound. It also substitute ref(x) for x in the expression, so that subsequence accesses to x are known to involve the store. The second rule relates to the update of a mutable binding, by merely modifying the store.

Consider the following expression var x := 2 in x := 3 in x, which is roughly equivalent to the following Swift code:

var x = 2
do {
  x = 3
  do { print(x) }
}

Starting from an empty store, the reduction of the expression would be as follows (borrowing from a more familiar and hopefully clearer Swift syntax):

([:], "var x := 2 in x := 3 in x")
([x: 2], "ref(x) := 3 in x")
([x: 3], "x")
([x: 3], "3")

Again, the term alias might be more appropriate than ref here.

I'm happy to do my best clarifying anything that may seem obscure.

3 Likes

Got it. Oh, this is fun!

We're going to need to agree on some terminology, though. That is what I call “initialization,” not “assignment.“ The difference is that initialization creates a scope in which a name is bound to a value, but assignment is an inherently mutating operation. Depending on your point of view, you might think of that as mutating a stored value in-place, or as mutating the binding of a variable within a given scope (more on those two points of view below):

var x = 3 // initialization
x = 4     // assignment

These names come from my background with C++ (which I know is broadly eschewed in PL research). If you have other words for these distinctions, I'm happy to consider using those instead.

I did not :face_with_hand_over_mouth:. Many PL researchers, myself included, are in a never ending quest for type soundness, following the "well typed cannot go wrong" slogan from Robin Milner.

In more concrete terms, I would like to show that if my type system says "OK, your program looks fine", then it cannot possibly crash at runtime (except for an explicitly unsafe operation, like x as! T ).

Yes, I'm well aware. Describing value semantics doesn't depend in any way on having the ability to express type- or memory-unsafe computations, so I think that might be a somewhat orthogonal issue.

In my view those things are inextricably linked. Unless we needlessly attach a notion of identity or address to values, the only way to observe aliasing is if there is mutation. Conversely, since identity is uninteresting in the absence of mutation, I don't see any motivation for including identity in the model.

Note: in the absence of identity and mutation, there's no way to distinguish reference semantics from value semantics. Because of the crucial role of mutation in making the distinction interesting, for some audiences, I've taken to calling value semantics “mutable value semantics.”

I might be misunderstanding something. I might be confused by the use of the term variable . In let x = 42 , would you consider x to be one?

Calling a let binding a “variable” is not incompatible with what I mean, but for the purposes of my description of requirements, which is designed to help a programmer ascertain whether a given type has value semantics, it's important to consider things that can happen when x is a var binding. So the simple answer is “no.” The more nuanced answer is “yes, but let bindings are a special case of variables that don't allow us to fully explore whether a type has value semantics.”

I prefer the term binding , as defined in Swift's language reference . This enables a clear distinction between let (a.k.a. constant) and var (a.k.a. variable or mutable) bindings.

OK, so this is where the two points-of-view come into play. There are two equivalent, simple mental models for thinking about mutable value semantics, where the word “variable” here includes names defined with both var and let:

  1. Variables are (potentially mutably) “bound to” immutable values. Initializations, assignments, and pass-by-value implicitly copy the binding, so two variables can be thought of as being bound to the “same” value. However, because the values are immutable and there's no identity, “same value” is meaningless apart from differences that could be distinguished by an == operation.
  2. Variables are “containers of” a single (potentially-mutable) value. Initializations, assignments, and pass-by-value implicitly create a copy of the entire contained value of the source.

The models are interchangeable for the purposes of describing operational semantics (neither one captures the computational complexity of operations on every type with value semantics). The design of Swift used the second model, in part because it allows us to describe references as values that refer to stored class instances, and because it maps more naturally to aggregate structures that aren't full of pointers.

Under this interpretation, the first bullet holds in the absence of mutation, whereas the independant part would be threatened by mutation.

As noted above, value semantics is not even interesting in the absence of mutation.

We might be in complete agreement, but simply attributing different definitions to the term "reference". Perhaps the phrasing "they are predicated upon a notion of reference aliasing" would be better suited?

I don't think so. If you're trying to say that absent the possibility of aliasing, value semantics aren't worth describing, because there's nothing else in the universe, I could understand that, but I would still disagree. That would be like saying integers have no properties worth formalizing in a universe that doesn't include fractions.

The 2nd to last bullet mentions, as one of the exceptions: "passing the expression as an inout parameter." This suggests that, at some point of the execution, there must be two bindings for the same value, two points of access .

This sounds like a misunderstanding of inout…

The law of exclusivity guarantees that there is no aliasing. The simplest way to look at a function taking an inout parameter is in terms of a rewrite rule f→f', in which each inout parameter to f is passed by value to f' and “tuple-ized” into the return value of f', and a corresponding rule for the use-sites of f that assigns into the inout variable. So your code (which would be a little more illustrative if it actually did some mutation) could be rewritten as:

func fPrime(x: Int) -> (Void, x: Int) {
  let b = x
  return ((), x: x)
}
var a = 26
(_, a) = fPrime(x: a)

You can either look at this as:

  • the value contained in a is copied into the parameter x and which is copied into the last element of the returned tuple, which is then copied back into a, or
  • the parameter x is bound to the value a is bound to; the last element of the returned tuple is bound to value x is bound, and finally a is rebound to the value that the last element of the returned tuple is bound to.

When Swift gets noncopyable types (and officially gets _modify accessors), the mental model has to change slightly: the value of the inout parameter is “exclusively borrowed” for the duration of the call to f (see Rust). Either way, exclusivity guarantees that there's no aliasing.

Both observations suggest the existence of a mapping that keeps track of the current bindings in the program.

That sounds like it's just restating the “binding” point of view for languages that have mutation: if you are going to think of all mutation as rebinding variables, then the state of the program is captured in the set of current variable bindings.

I'll try to grok the rest of your post while you read this :slight_smile:

1 Like

Yeah, right! :grinning:

I completely agree with the second part, but I would distinguish initialisation from declaration. Otherwise there's no way to accurately describe the semantics of let x: Int; x = 0.

Okay, I think I understand where you're going. Basically you would express value semantics purely in terms of substitution. That's an interesting presentation.

I'm comfortable with these terms.

I agree.

I agree. I believe there's a sentence in my paper that reads "We also elide the difference between value and reference types, whose respective observable behaviors are indistinguishable in the absence of side effects."

Seems like a good name. Although one should be careful not to overburden the discussion with too many "custom" terms. If one is to understand value-binding as giving names to storage (see below), then I believe one does not need to distinguish between different kinds of mutation semantics.

This is also the way I would have described the model. In fact, to me personally, this view is much, much easier to grasp than any definition of identity, notional value, etc. Names denote immutable (let) or mutable (var) storage. Value-binding declarations introduce such names in scope, with the desired mutability.

Note that it also gives a simple way to explain the role of let. It is the good old bitwise immutability of the value being bound, as offered by const in Javascript, val in Scala, final in Java, etc.

As an aside, I really dislike the term "variable" in general, and will go as far as to say it is harmful.

I do not suggest such a proposition. I think the confusion stems from the fact that I did not understand how your definition does indeed includes some redundancy. It is much clearer now.

Well, I believe I have in fact exactly described an "exclusively borrowed" alias (notice that I mentioned "although not usable at the same time").

The law of exclusivity does not guarantee freedom from aliasing. For context:

If a storage reference expression evaluates to a storage reference that is implemented by a variable, then the formal access duration of that access may not overlap the formal access duration of any other access to the same variable unless both accesses are reads.

It is sound to think of an inout parameter as an alias, in terms of the operational semantics, because the "law" is in fact described in terms of the static semantics. Therefore, it is in fact irrelevant to know whether the caller of a f holds an alias to x, as such an alias is inaccessible for the duration of f's call. This is merely a restatement the of a Rust-like exclusive borrowing mechanism.

That said, I do not have any particular objection about the model you suggest, in which inout parameters would be sugars for simply producing the changes in the function's codomain. In fact, this is the mental model I used to understand mutating when I learned Swift. Furthermore, I coincidentally offered a similar presentation for the static typing of inout parameters using alias types. Notice how the signature ∀[a, {a -> i64}](ptr(a)) -> Void @ {a -> i64} is awfully similar to that of your fPrime function. :sweat_smile:

My intuition is that using a more classical approach, where assignment is expressed as as mutation of some store, will eventually lead to a more concise and unsurprising model. But it might be interesting to explore both avenues!

I think our discussion is converging! I'm just out of a very intense conference, so I might take this weekend off, but I'll try to get started on a working formalization very soon. I'll focus on the operational semantics; we can discuss the type system later =)

Sure; I was just giving an example. In yours, the part after the semicolon is an initialization, not an assignment.

I'm not sure I understand what you mean by “in terms of substitution.” FWIW, I'm not expressing a preference for either of those points of view, if that's what you mean by “would.” I don't normally take the “mutable bindings to immutable values” point of view, but it can be useful because you can clearly see where the properties of value semantics come from in that case: it's in the immutability of the value that usefully limits expressive power.

  • “Mutable value semantics” is not intended to draw a distinction not with some other form of mutation semantics, but with pure functional programming, where everything is a value regardless of representation, because of the inability to express mutation.
  • Of course one doesn't need to distinguish between different kinds of mutation semantics. If your only goal is “one formalism to rule them all,” definitely get on with describing everything in terms of memory and storage, and I have nothing to contribute to that project. The point of distinguishing value semantics from not-value-semantics is that when you know you are dealing with types having value semantics, it's possible to reason locally about the effects of mutation. When reference semantics are in play, every mutation potentially has potentially arbitrary non-local effects.

I'm confused. The term “binding,” which you said you prefer, is not particularly useful for that model, at least the way I think about it. For me, “binding” calls to mind a separation between the name and its value such that two names can meaningfully be bound to “the same” value. In the second model, each name has its own bucket for all the bits of a value.

Also, you can't get away from notional value if you want to talk about value semantics (see below).

Names denote immutable ( let ) or mutable ( var ) storage.

Check

Value-binding declarations introduce such names in scope, with the desired mutability.

I'm confused again. Is this a “value-binding declaration?”

let x: Int

It introduces a name in scope and specifies a mutability, and yet there's no value involved and I don't see a way to usefully talk about “binding.”

Note that it also gives a simple way to explain the role of let . It is the good old bitwise immutability of the value being bound, as offered by const in Javascript, val in Scala, final in Java, etc.

I have to disagree. While let does indeed imply bitwise immutability of all locally-stored struct properties, the role of let is greater when the notional value of the represented type is not captured in the bits of the locally-stored properties. Array<Int> is an excellent example. Locally, it contains a single reference to dynamically-allocated remote storage. The notional value of the array includes the values of bits in that remote storage. That fact is reflected by the fact that those bits can never be mutated if they are referenced from a let-bound Array. That, and the fact that we never mutate those bits if the storage is shared, allows us to reason about all the Ints in the array as part of the array's independent value. The value-semantic properties of Array<Int> come from the fact that it has a well-defined notional value that is reflected in the choice of which operations are mutating and in its semantic isolation from other array values.

As an aside, I really dislike the term "variable" in general, and will go as far as to say it is harmful.

Can't say I agree, except inasmuch as it is used in Swift's existing literature to refer to both vars and lets, which is confusing.

Seems to me that if it's not usable at the same time, calling it an alias has no meaning. And even with non-copyable types, you don't need to think of it as a borrow. Semantically, you could use a similar rewrite as I showed for copyable types to move a noncopyable value from its storage and then move it back.

Yes, the language above is designed to cover accesses by unsafe pointer. If I must be absolutely precise, it guarantees freedom from observable aliasing of value types, via safe operations. I assume it's interesting for you to discuss properties of programs built out of Swift's safe subset.

It is sound to think of an inout parameter as an alias, in terms of the operational semantics, because the "law" is in fact described in terms of the static semantics. Therefore, it is in fact irrelevant to know whether the caller of a f holds an alias to x , as such an alias is inaccessible for the duration of f 's call. This is merely a restatement the of a Rust-like exclusive borrowing mechanism.

Yes, it's sound to think of it as an alias, but also inconsequential, AFAICT, except that it compromises our power to reason about the code because we have to think “alias that has only one possible access point at any time,” which is a complicated way to say… nothing?

That said, I do not have any particular objection about the model you suggest, in which inout parameters would be sugars for simply producing the changes in the function's codomain. In fact, this is the mental model I used to understand mutating when I learned Swift. Furthermore, I coincidentally offered a similar presentation for the static typing of inout parameters using alias types. Notice how the signature ∀[a, {a -> i64}](ptr(a)) -> Void @ {a -> i64} is awfully similar to that of your fPrime function. :sweat_smile:

My intuition is that using a more classical approach, where assignment is expressed as as mutation of some store, will eventually lead to a more concise and unsurprising model. But it might be interesting to explore both avenues!

If your goal is a minimal formalism that can describe the entire Swift language, I agree that you may just want to ignore value semantics (heck, just create a formalism for LLVM and then you can describe-all-the-languages). If you want to create a formalism that supports more easily reasoning about large classes of Swift programs, then I claim it's worth thinking about.

I think our discussion is converging! I'm just out of a very intense conference, so I might take this weekend off, but I'll try to get started on a working formalization very soon. I'll focus on the operational semantics; we can discuss the type system later =)

Very cool!

3 Likes

I think we are on the same page.

I guess I have been a little stubborn with the idea of modeling memory to represent inout parameters. The reason is that I think all the properties you mention (immutability as a useful limit on expressiveness and local reasoning) can be defined with a static semantics that would restrict the set of behaviors of an unrestricted operational semantics. In fact, I postulate that most if not all points on which our opinions diverge stem from this!

That said, I think it is clear that expressing value semantics at the operational level results in much a simpler presentation, which does not have to deal with linear types, capabilities, ownership, etc... Therefore, if the goal is to formally explain/illustrate value semantics to a human audience, then it is a better approach.

I see a binding as one pair domain/codomain of a function that associates visible names to their value. It does not really matter if the values are stored in individual buckets, in shareable memory cells. The "binding function" can abstract over these details.

Given a binding function f, and a name x such that f(x) = v, the binding's mutability dictates whether or not I am allowed to update f to a function f' such that f'(x) = v'. When v has reference semantics, although I cannot perform such a mutation, I still can modify v.x (provided it is a var property).

In more concrete terms, this is legal:

class Ref {
  var x: Int = 0
}

struct Val {
  var x: Ref
  var y: Int = 8
}

let a = Ref()
a.x = 1

let b = Val(x: Ref())
b.x.x = 1

In the first case, the the bucket contains a pointer, i.e., f(a) = 0x.... Because of let, I am not allowed to change f(a) but the immutability does not propagate to the contents of the pointee.

In the second case, the bucket contains a struct, i.e., f(b) = { 0x..., 8 }. Because of let, I am not allowed to change f(b) and therefore I cannot mutate the integer neither, even if it is a var. Yet the immutability still does not propagate to the contents of the pointee.

In both instances, let specifies the immutability of the binding in the sense that I cannot redirect f(x) to another bucket (after initialization).

I hope this perhaps very unique mental model makes sense and that I am not embarrassing myself.

It is useful to call it an alias nonetheless, at least to be precise in the definition of static invariants. This is why the law of exclusivity guarantees freedom from observable aliasing. On this we fully agree.

I understand what you mean, and I agree.

The distinction is only useful to express static invariants, in the context of a type system (see above).

That is not what I meant :sweat_smile:. Determining the right level of abstraction in any formal model is a difficult problem, for many reasons. I genuinely think that both approaches can help reasoning about large classes of Swift programs, even if the one I originally suggested sits at a slightly lower level.

You will find a work-in-progress technical paper defining Swift's value semantics here

I refrained from defining notions of memory and/or references. Assignments merely update a set of bindings mapping program variables in scope to their respective value. It follows that inout parameters are handled the way you suggested, by reassigning the arguments at the call site ones the function returns. This results in a rather elaborate mechanism, but overall I find the complete semantics relatively straightforward (at least to a someone familiar with formal semantics).

I believe the semantics satisfies your bullet points. However, it does not (yet?) deal with either mutating methods, accessors and subscripts (in fact, it does not support any notion of method) nor concurrency. I think we could get away without methods, as they do not significantly improve on the language's expressiveness. However, concurrency would definitely require more attention.

The paper develop several ideas of extensions. I think the most pressing ones are to define reference semantics, for comparison.

Would you like to co-author the evolution of this paper with me? I would love to turn this work into something that can be used by the community at large, and I think the insights of a practitioner would help tremendously.

I agree that is possible.

That said, I think it is clear that expressing value semantics at the operational level results in much a simpler presentation, which does not have to deal with linear types, capabilities, ownership, etc... Therefore, if the goal is to formally explain/illustrate value semantics to a human audience, then it is a better approach.

Agreed.

In both instances, let specifies the immutability of the binding in the sense that I cannot redirect f(x) to another bucket (after initialization).

I hope this perhaps very unique mental model makes sense and that I am not embarrassing myself.

It seems to me the model you just expressed is the “mutable bindings to immutable values” point of view I described earlier. The reason I say that is you talk about let/var controlling whether you can “redirect f(x) to another bucket,” even when x names a struct; in the other model you'd fill up the same bucket—permanently associated with the name x, with a new value.

As I said earlier, the models are equally valid. The only thing that confuses me is that you seemed to express a strong preference for using the word “binding,” which implies “redirecting” the value-binding of a var at a different value container, while also expressing a strong preference for the variables-are-containers-of-potentially-mutable-values point of view. My guess is that you hop fluidly between the two points of view without realizing it.

You lost me; what invariants?

This is why the law of exclusivity guarantees freedom from observable aliasing. On this we fully agree.

I'm confused. It seems to me that when talking about language semantics, the unobservable has no meaning.

Sure, and you can reason about the semantics of a program whether it is written in C or Haskell, and either one is more helpful than having no abstraction at all. But they present different trade-offs. Your lower-level description can be applied to a larger class of programs, but unless you build a higher-level formalism for describing value semantics into the system (perhaps on top of the lower-level formalisms), I claim a huge sub-class of those programs will be much more difficult to reason about than necessary.