Introducing Swift in higher education

kennyc · November 18, 2021, 4:49am

The Node might be getting passed down to a view for presentation to the user, and the view in turn just happens to store the Node in a local property. The view might allow the user to mutate some properties on the Node, like expanding it to show the children or editing the name field. Those edits need to propagate back up to some source-of-truth in the application, which in this case might be a data model or document holding a Tree of Nodes.

If I take my trivial example and bring it into a "real world" application, then I routinely find myself with views that have properties referencing some shared state.

Forum friendly examples are always tough, but maybe this will help a bit:

Thinking in AppKit/UIKit as SwiftUI is quite different.

```
final class Model {
  var tree: Tree {
    didSet {
      // Notify ListView and maybe DetailView...
    }
  }
}

final class ListView {
  var presentedTree: Tree?
}

final class DetailView {
  var presentedNode: Node?
}

final class SummaryView {
  var presentedTree: Tree?
  var presentedNode: Node?
}
```

In this example, a mutation of the tree in the Model will trigger a CoW because ListView and SummaryView each have a reference to it. A mutation of any property in the DetailView needs to propagate back to the Model and then down to the other views.

So either the Model or the Tree needs some API that takes a Node, or Node.ID, and can mutate the tree to reflect the desired changes. IMHO, that API is challenging to build because it's not clear how it should be defined. And if Tree is something a lot more complex, with deeply nested properties, then it can be challenging to "find" the instance of the property to mutate.

Now, there's a strong argument to be made that perhaps the Views should only have access to the Model. Perhaps that's the solution, but the Model is likely to be a reference type so now all we've really done is hide the value type behind a reference type.

Perhaps a Tree example is a bit too simplistic to express myself. Consider perhaps a Document type that represents a rather complex type with many levels of nested properties and sub-types.

I would find it strange to pass around a Document to every view that needs to display some subset of the document. If the Document had layers and I had a view for showing Layers, then I'd expect to just give said view an Array of layers, not the entire Document.

But any mutations of those layers, like their ordering or visibility, need to get back to whatever component is holding a reference to the "main" Document itself. Document could have an API like updateLayers(), or maybe there's some more functional approach like updateLayers(document, layers) -> Document.
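A minimal sketch of the two API shapes mentioned above, assuming hypothetical Layer and Document types (none of these names come from a real framework). The layers view only ever sees an array, and the owner of the document writes the edited array back:

```swift
// Hypothetical types for illustration only.
struct Layer: Equatable {
    var name: String
    var isVisible = true
}

struct Document {
    var title: String
    var layers: [Layer]

    // Mutating style: the owner of the document replaces its layers in place.
    mutating func updateLayers(_ newLayers: [Layer]) {
        layers = newLayers
    }
}

// Functional style: takes a document and returns an updated copy.
func updateLayers(_ document: Document, _ layers: [Layer]) -> Document {
    var copy = document
    copy.layers = layers
    return copy
}

var document = Document(
    title: "Poster",
    layers: [Layer(name: "Background"), Layer(name: "Text")]
)

// A layers view receives only the array and edits its own copy...
var editedLayers = document.layers
editedLayers[1].isVisible = false
editedLayers.swapAt(0, 1)

// ...and whoever owns the document writes the result back, in either style.
let updated = updateLayers(document, editedLayers)
document.updateLayers(editedLayers)
```

Either way, the write-back is an explicit step performed by the owner; the view's copy of the array never changes the document on its own.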

If I understand you correctly, then you're suggesting that the Tree itself should be passed around instead of a Node? That would certainly allow any view to mutate the Tree, but now you have multiple references to it everywhere.

tera · November 18, 2021, 4:50am

You can optimise it a bit with one big function that allows setting multiple params at once.

In my version of Tree/Graph having name/expanded vars would be error prone:

```
func foo() {
    var node = tree.node(for: 123)!
    // ....
    node.isExpanded = true
    // forgot to propagate it back to the tree
}
```
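One way to make that mistake harder is to route edits through the tree itself. The following is only a sketch under assumed names: the Node and Tree types and the update(_:_:) method are hypothetical, since the actual Tree/Graph implementation is not shown in this thread.

```swift
// Hypothetical Node and Tree types for illustration.
struct Node {
    let id: Int
    var name: String
    var isExpanded = false
}

struct Tree {
    private var nodes: [Int: Node] = [:]

    mutating func insert(_ node: Node) {
        nodes[node.id] = node
    }

    func node(for id: Int) -> Node? {
        nodes[id]
    }

    // Runs `change` on the node and writes the result back in the same call,
    // so the caller has no separate propagation step to forget.
    // Returns false when no node has the given id.
    @discardableResult
    mutating func update(_ id: Int, _ change: (inout Node) -> Void) -> Bool {
        guard var node = nodes[id] else { return false }
        change(&node)
        nodes[id] = node
        return true
    }
}

var tree = Tree()
tree.insert(Node(id: 123, name: "Root"))

// Editing a copy leaves the tree untouched, which is the mistake described above.
var copy = tree.node(for: 123)!
copy.isExpanded = true
let expandedBeforeUpdate = tree.node(for: 123)?.isExpanded

// Editing through the tree writes the change back.
tree.update(123) { $0.isExpanded = true }
```

The read-only node(for:) accessor can stay for display purposes; only mutation goes through update.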

rayx · November 18, 2021, 5:42am

dabrahams:

```
/// The identity of a person
typealias PersonID = Int

struct Person {
  var name: String
  var address: String
}

struct SocialNetwork {
  private var storage: [(Person, friends: [PersonID])] = []

  mutating func add(_ newMember: Person) -> PersonID {
    storage.append((newMember, []))
    return storage.count - 1
  }
  ...
}
```

Hi, Dave, I have two questions on the code. They are mostly implementation details.

1. You use the array index as the value of PersonID. I understand that makes it fast to retrieve a Person value by its ID (I suppose that's why you do it this way), but doesn't that couple the PersonID value to the storage format? It seems a bit inflexible to me.
2. Is there any reason why you don't put friends in the Person struct? In practice, a Person may have many properties (e.g., tweets, likes, etc.). I think it's natural to put them in the struct?

That said, if friends is moved to the struct and we would like to implement a method, say getFriendNames(), on Person, then the method would need to take a closure to translate a PersonID to a name. This is a minor example of why I said value types often require functional programming techniques.
```
 // Suppose friendIDs is moved to Person struct
 extension Person {
     func getFriendNames(_ idResolver: (PersonID) -> Person) -> [String] {
         return friendIDs.map { idResolver($0).name }
     }
 }
```
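For completeness, here is a self-contained sketch of how a caller might supply that closure. It assumes friendIDs has been moved into Person as supposed above, and it uses a plain array as a stand-in for the network's storage; the sample data is made up.

```swift
typealias PersonID = Int

// Assumes friendIDs lives in Person, as the snippet above supposes.
struct Person {
    var name: String
    var address: String
    var friendIDs: [PersonID] = []
}

extension Person {
    func getFriendNames(_ idResolver: (PersonID) -> Person) -> [String] {
        return friendIDs.map { idResolver($0).name }
    }
}

// A plain array stands in for the storage; the index is the PersonID.
let people = [
    Person(name: "Ann", address: "1 Main St", friendIDs: [1, 2]),
    Person(name: "Bob", address: "2 Main St"),
    Person(name: "Cy", address: "3 Main St"),
]

// The caller supplies the lookup, so Person never needs to know about the storage.
let friendNames = people[0].getFriendNames { people[$0] }
```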

rayx · November 18, 2021, 6:01am

I actually mixed functional programming and imperative programming in my app. I'd think it's functional in general and imperative in local code.

Take your social network example. The storage format is simple. Let's assume that we need to implement a mutating method, and that the method is so complex that we must convert the storage format to an intermediate format (e.g. a graph) to determine how to modify it. This is the case in my app, and I do it in the following steps:

1. Convert the storage data to an intermediate format.
2. Determine the action (the change to perform) based on the intermediate data.
3. Modify the storage data in place.
4. Repeat from step 1 until no action is needed.
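The steps above can be sketched roughly as follows. This is an illustration under made-up assumptions, not the code from my app: the storage is a list of friend lists, the intermediate format is a set of directed edges, and the only action is "add a missing reverse edge".

```swift
// Storage format: friends[i] lists the friend IDs of person i (hypothetical data).
var friends: [[Int]] = [[1, 2], [], [0]]

// Intermediate format: a set of directed edges, rebuilt from the storage each round.
struct Edge: Hashable {
    let from: Int
    let to: Int
}

// Step 1: convert the storage into the intermediate format.
func makeEdges(_ storage: [[Int]]) -> Set<Edge> {
    var edges = Set<Edge>()
    for (person, list) in storage.enumerated() {
        for friend in list {
            edges.insert(Edge(from: person, to: friend))
        }
    }
    return edges
}

// Step 2: decide on one change (a missing reverse edge), or nil when nothing is left.
func nextAction(_ edges: Set<Edge>) -> Edge? {
    edges
        .map { Edge(from: $0.to, to: $0.from) }
        .filter { !edges.contains($0) }
        .min { ($0.from, $0.to) < ($1.from, $1.to) }
}

var rounds = 0
while let missing = nextAction(makeEdges(friends)) {   // steps 1 and 2
    friends[missing.from].append(missing.to)            // step 3: mutate storage in place
    rounds += 1                                         // step 4: repeat until no action
}
```

The two helper functions only read their inputs; the single mutation happens in the loop body, on the storage itself.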

Note that I do mutate values in step 3, but on the architecture level I'd think this is a functional way (please correct me if I'm wrong), because the data each step passes to the next step is not mutated.

In general, when doing value type programming, one main issue I often face is how to access or modify a value related to the current value in an elegant way. That's where I find a functional programming style helpful. In addition to the above architecture-level approach, I use a lot of other approaches, like a carefully designed storage format, higher-order functions, or just defining functions on the top-level struct. That's why I said value types require functional programming.

I do use inout and it's very helpful. But I don't think it helps in the above example, where I need to convert the storage format to an intermediate format, does it?

That would be my honor. I realized it in my struggle when I rewrote my app using value type, and I'm glad I worked the solution out.

dabrahams · November 18, 2021, 6:30am

Sorry, sometimes I forget how deeply reference-semantic thinking is ingrained in the culture…

When you store an instance of value-semantic Node, you're storing (a copy of) its value, just like when you store an Int, you're storing a copy of its value. You're not surprised that if you store the x coordinate of a Point, you can't magically reach back into the Point and mutate it just by changing your stored Int, are you? So, no, you don't get to do that with a value semantic type. Storing a copy of the Node locally is of no use to you in effecting a mutation on some other Node; you really have no reason to be storing a copy of it unless you have some use for a snapshot of its whole value, which includes the values of all its children.
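To make the analogy concrete, here is a small sketch with hypothetical Point and Node types. The stored Int and the stored Node behave the same way: both are independent copies.

```swift
// Hypothetical value types for illustration.
struct Point {
    var x: Int
    var y: Int
}

struct Node {
    var name: String
    var children: [Node] = []
}

var point = Point(x: 1, y: 2)
var storedX = point.x               // a copy of the Int
storedX = 10                        // changes only the copy

var root = Node(name: "root", children: [Node(name: "child")])
var storedNode = root.children[0]   // a copy of the whole Node value
storedNode.name = "renamed"         // changes only the copy

// To change the original, mutate it where it lives.
root.children[0].name = "renamed in place"
```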

If I take my trivial example and bring it into a "real world" application, then I routinely find myself with views that have properties referencing some shared state.

Forum friendly examples are always tough, but maybe this will help a bit:

Thinking in AppKit/UIKit as SwiftUI is quite different.

Ah, yeah… part of the whole point of SwiftUI was to move away from the rampant reference semantics that pervades the Cocoa kits.

final class Model { 
  var tree: Tree {

You've already left the world of value semantics as soon as you have a class with a mutable member. If you want to stay within value semantics, make your model a struct. Or in your case since there's nothing in it but a Tree, you could

typealias Model = Tree

Something has to own the model; that's the one thing with a stored Model property. I'm not an expert on the Cocoa kits, but I'm guessing if the model is your document, then probably the Application class (or its delegate) should own it, and if this is more like a representation of the state of some view, maybe the ViewController (or its delegate) is the right owner.

In this example, a mutation of the tree in the Model will trigger a CoW because ListView and SummaryView each have a reference to it.

Well, no, they each have a copy of it, and thinking of them as references will get you in big trouble. But yes, if you store copies of arrays and then mutate them, you should expect to pay for it. These are really copies (which is why you can't use them to mutate one another), and CoW is just a way of postponing (most of) the cost of making them… possibly forever, if you never mutate them.
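A rough sketch of that behaviour, using a hypothetical Tree that wraps an array. The bufferAddress helper peeks at an implementation detail purely to observe whether storage is shared; it is not something to rely on in real code, and what it reports may vary between compiler versions and optimisation settings.

```swift
// Hypothetical Tree for illustration.
struct Tree {
    var rootValues: [Int]
}

// Returns the address of the array's element buffer, only to observe sharing.
func bufferAddress(_ array: [Int]) -> UnsafePointer<Int>? {
    array.withUnsafeBufferPointer { $0.baseAddress }
}

var modelTree = Tree(rootValues: [1, 2, 3])
let listViewTree = modelTree          // a copy; the element buffer is not duplicated yet

let sharedBeforeMutation =
    bufferAddress(modelTree.rootValues) == bufferAddress(listViewTree.rootValues)

modelTree.rootValues[0] = 99          // first mutation: this is where the copy is paid for

let sharedAfterMutation =
    bufferAddress(modelTree.rootValues) == bufferAddress(listViewTree.rootValues)

print(sharedBeforeMutation, sharedAfterMutation)
```

Whatever the two flags report, the values themselves are independent: listViewTree still holds the original contents after modelTree is mutated.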

A mutation of any property in the DetailView needs to propagate back to the Model and then down to the views.

Nope. I don't mean to be flip, but that's just not the way to think about these values.

Now, it may be that the Cocoa kits are really hostile to value semantics; they are, after all, based on an “everything is a reference” programming model. For example, maybe there's a method like this that you need to override in order to affect any interaction at all from your DetailView:

func onTap()

There's no way for this method to be passed an inout Node for mutation. Okay, then you need to make some compromises. Maybe you bring back the Model class and spread references to it around the views, but the views store keypaths into the model that describe the “address” of the specific node to which they correspond. At least then the Law of Exclusivity would dynamically ensure no overlapping accesses to the forest, and below that level you'd have static guarantees from value semantics. Such a keypath would be equivalent to an [Int] storing the child indices in a path from the root of the model. But as I understand it, the best practice for keeping everything in sync is to rebuild/adjust the view hierarchy to match the model, using one consistent procedure, after each mutation… so you would not actually expect to ever read the model through these stored references. I learned about the consistent rebuild from @kylemacomber, who knows a lot more about how to develop GUI apps using the Cocoa kits than I probably ever will, so I hope he will chime in with details.
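A minimal sketch of that compromise, with hypothetical Node, Tree, Model and DetailView types and a made-up onTap() standing in for whatever callback the framework requires. The view stores a reference to the model plus a key path to its node, never a Node copy:

```swift
// Hypothetical value types for illustration.
struct Node {
    var name: String
    var isExpanded = false
    var children: [Node] = []
}

struct Tree {
    var roots: [Node]
}

// The single reference type that owns the value-semantic tree.
final class Model {
    var tree: Tree
    init(tree: Tree) { self.tree = tree }
}

// The view keeps the "address" of its node rather than a copy of the node.
final class DetailView {
    let model: Model
    let nodePath: WritableKeyPath<Tree, Node>

    init(model: Model, nodePath: WritableKeyPath<Tree, Node>) {
        self.model = model
        self.nodePath = nodePath
    }

    // Stand-in for a framework callback that cannot receive an inout Node.
    func onTap() {
        model.tree[keyPath: nodePath].isExpanded.toggle()
    }
}

let model = Model(
    tree: Tree(roots: [Node(name: "root", children: [Node(name: "leaf")])])
)
let detail = DetailView(model: model, nodePath: \Tree.roots[0].children[0])
detail.onTap()
```

A stored key path like this goes stale if nodes are inserted or removed ahead of it, which is one reason rebuilding the view hierarchy from the model after each mutation fits well with this approach.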

Now, there's a strong argument to be made that perhaps the Views should only have access to the Model.

I'm no expert, but I think MVC is supposed to isolate views from models?

the Model is likely to be a reference type so now all we've really done is hide the value type behind a reference type.

That is sometimes a necessary compromise, but it still offers some protection that you don't get by making a fine-grained network of interacting objects.

Not if it's truly a tree, and you don't need to traverse from child to parent or across siblings. That's perfectly well representable by a simple value because of the whole-part relationships I mentioned earlier. I was describing what you do for traversing an arbitrary graph. Then (among other approaches) you can give each vertex an integer index and represent all of the edges in the graph as [[Int]].
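As a sketch of the [[Int]] representation, here is a hypothetical Graph type (the names and the traversal are illustrative, not taken from any library). The whole graph, cycles included, is still a single value that can be copied and compared like any other:

```swift
// Hypothetical Graph type for illustration.
struct Graph {
    // edges[v] holds the indices of the vertices reachable from v in one step.
    private(set) var edges: [[Int]] = []

    mutating func addVertex() -> Int {
        edges.append([])
        return edges.count - 1
    }

    mutating func addEdge(from source: Int, to target: Int) {
        edges[source].append(target)
    }

    // Depth-first traversal; returns vertices in the order they are first visited.
    func reachable(from start: Int) -> [Int] {
        var visited = Set<Int>()
        var order: [Int] = []
        var stack = [start]
        while let vertex = stack.popLast() {
            if visited.insert(vertex).inserted {
                order.append(vertex)
                stack.append(contentsOf: edges[vertex].reversed())
            }
        }
        return order
    }
}

var graph = Graph()
let a = graph.addVertex()
let b = graph.addVertex()
let c = graph.addVertex()
graph.addEdge(from: a, to: b)
graph.addEdge(from: b, to: c)
graph.addEdge(from: c, to: a)   // a cycle, which a plain whole-part tree cannot express

let snapshot = graph             // an independent copy of the whole graph
graph.addEdge(from: a, to: c)    // does not affect the snapshot
```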

You have to expunge that from your thinking and your language if you want to deal effectively with value types directly. There is no "reference to a value." If Tree is a value type, there is no reference-to-a-Tree.

dabrahams · November 18, 2021, 6:34am

You could say Int is error-prone for the same reason. Seems to me this is like using a printing press and being surprised when it doesn't act like a chainsaw.

kennyc · November 18, 2021, 7:08am

Very much so for some of us... ;)

No, but coming from Objective-C if I have an integer or a point, then I view those as primitive types. But if my view had some richer object, like a Person, then I would expect to be able to edit those properties. If the Person was a class, which it would likely be in Objective-C, then every other view with a reference to that Person would see the changes.

(Admittedly, that's a problem unto itself and something value types try and prevent.)

I agree. Trying to use value types from top-to-bottom is the goal but that's where I'm hitting some walls. Inevitably, some classes start popping up and kind of break the implementation.

I'm playing a little loose with my terminology, but you're right. When I think of having a "reference" to something when discussing a type like a Tree, what I'm really thinking of is the reference to the Array's underlying shared storage.

So while I do have a copy, my copy does have a reference under the hood. This has tripped me up in the past when benchmarking some code and wondering why a seemingly simple mutation, like toggling a boolean property, was so slow. Turns out a copy I had made of a property was triggering CoW, but the performance degradation was only noticed when mutation occurred, not when the copy occurred. I completely understand why, but in my (very) early days of learning Swift it caught me off guard.

Point being, if I do have an extensive data structure that is getting mutated, it's important that nobody else have a copy of it if performance is important. That means checking if any views or intermediate components happen to be making a copy purely for convenience, or perhaps bad habits.

In some ways, that's the answer I'm trying to determine. When looking at functional languages and functional frameworks, value semantics, or immutable data, make a lot of sense and it's relatively clear how to program within those constraints.

When dealing with pure Swift, I think that's also the case. But when using Swift within the context of an AppKit or UIKit application, where I'm sure the vast majority of Swift is used, then for me at least it gets a lot harder to reason about.

It might very well be that these older frameworks are not entirely compatible with pure value semantics, which is fine. But I've been curious if my difficulties are because of this or that I'm just "doing things wrong".

This is sort of the approach I've been playing with. A sort of "view model", which is a class, owns the data, which is represented as a value type. How to "address" things and make the mutations is what started this thread for me.

You're right. As noted above, what I was trying to say was that a copy of a Tree also has a copy of rootNodes: Array. Each copy of that Array does have a reference to the shared, underlying storage.

Thus, two views, each with their own copy of the Tree, indirectly also have a reference to the underlying storage. That's where I'm (incorrectly?) applying the word "reference".

– Appreciate all the input and lengthy replies.

rayx · November 18, 2021, 12:46pm

@kennyc I read the discussion between you and @dabrahams carefully. I think you seemed quite confused. Below are the key points that I think may be helpful:

The data in the view model is just a snapshot. It is read-only. Yes, you are able to modify the model on user input, but not this snapshot. Conceptually, the snapshot is separate data from the model, but at the implementation level there might be CoW when you change the model. I don't think that's necessarily an issue because, in my opinion, one should only pass to the view the small portion of the model that's related to it (I don't agree that the entire tree or document should be passed to the view).
When the view receives user input, it modifies the model, which in turn generates a new snapshot for the view. As for how the view is able to access the model, I think that's an implementation detail. For example, you can put it in the Application class or define it as a global variable.

Also note that the "view" here is a general concept and isn't necessarily related to GUI.

rayx · November 18, 2021, 12:54pm

@dabrahams I have a question on this. I understand the view must have some way to access the model so as to modify it. I usually think it can be implemented by saving the model value in the Application class (as you mentioned) or just as a global variable. But you seemed to suggest passing the entire model (e.g. a graph or a document, as an inout parameter I suppose) to each view. Is this a common practice?

BTW, a practical issue I often face in value type programming is how to access sibling or containing struct. For example, suppose we have the following data model:

  top-level struct (the entire data model)
  -> struct 1
     -> struct 1A
     -> struct 1B
  -> struct 2

and suppose I have a method foo() in struct 1A which needs to access struct 1B or struct 2. I usually implement it by passing a closure, but I often wonder if this is a common pattern or if this is the kind of issue that I should avoid in the first place?
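To make the question concrete, here is a sketch of three options I can think of, using made-up struct names and fields that mirror the layout above. I'm not claiming any of them is the recommended pattern.

```swift
// Hypothetical structs mirroring the layout above; the fields are made up.
struct Struct1B {
    var bonus: Int
}

struct Struct1A {
    var count: Int

    // Option 1: the caller hands in the sibling value the method needs.
    func total(with sibling: Struct1B) -> Int {
        count + sibling.bonus
    }

    // Option 2: a closure defers the lookup, as described above.
    func total(bonusLookup: () -> Int) -> Int {
        count + bonusLookup()
    }
}

struct Struct1 {
    var a: Struct1A
    var b: Struct1B
}

struct Struct2 {
    var scale: Int
}

struct TopLevel {
    var one: Struct1
    var two: Struct2

    // Option 3: define the operation on the level that can see everything.
    func scaledTotal() -> Int {
        one.a.total(with: one.b) * two.scale
    }
}

let top = TopLevel(
    one: Struct1(a: Struct1A(count: 2), b: Struct1B(bonus: 3)),
    two: Struct2(scale: 10)
)
let viaParameter = top.one.a.total(with: top.one.b)
let viaClosure = top.one.a.total { top.one.b.bonus }
let viaTopLevel = top.scaledTotal()
```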

svanimpe · November 18, 2021, 12:55pm

Thank you everyone for your input. It's nice to see people passionate about what they do, or what they'd like to change, and it was educational for me to read some of these comments.

However, there are some things I feel like I should clarify:

The curriculum I'm trying to build will be similar to what I teach in my day job, which is a Bachelor's degree in Applied Computer Science. This is quite different from an academic Master's degree, as it is mostly focused on app and web development and systems administration, and doesn't include the formal courses one would typically find in an academic research-focused curriculum.
This kind of curriculum covers many of the jobs in our industry. My employer, for example, is the biggest supplier of graduates in our country.
One of the primary goals for this PWS Academy I'm trying to start is to be inclusive. Part of this inclusivity is recognising that most development jobs don't require an in-depth understanding of type systems, value semantics, generics, ..., and that this is fine. At no point should a developer feel inferior or less proud about what they do, just because their language doesn't have these features, or they don't have an interest in learning about, for example, functional programming.
Discussing these topics in a beginners course is a sure-fire way to discourage most students from pursuing programming any longer. They'll just think programming is not for them after all, and give up.
I chose Swift as a teaching language not because of its relevance in industry (which, if we're honest, is rather small), but mainly because it allows me to teach both introductory topics (in this course) and more advanced topics (in future courses), in a language that values safety and clarity. My hope is that students will take what they've learned from Swift and apply it to whatever language they’ll end up using in their career.

W.r.t. value types, my plan was to teach this not through theory, but by letting the students use libraries and frameworks designed around value types (such as the Standard Library and SwiftUI), and then discussing how these libraries/frameworks achieve what they're doing with value types, instead of classes. I'm not sure if a truly in-depth discussion (meaning: trying to fully comprehend what @dabrahams is trying to explain) even belongs in this curriculum. If I were to include that, it would have to be an optional cherry-on-the-cake course, as requiring every student to pass such a course wouldn't be very inclusive.

@John_McCall May I ask that this discussion on value types be moved to a different thread?

dabrahams · November 18, 2021, 4:46pm

No need for moderator intervention; as I said earlier, I'm happy to move: How to program (and teach programming) using value semantics

Thanks for your patience with this digression

John_McCall · November 18, 2021, 4:58pm

I'm not sure how to split that discussion out without performing a lot of surgery on individual posts. If you have specific suggestions, I can see what I can do.

Otherwise, I'll just ask that people respect Steven's request and take that discussion to Dave's new thread.

kennyc · November 18, 2021, 10:46pm

@svanimpe Great work on the course work and for the efforts in teaching others. Apologies for derailing this post with the deep dive into value types and semantics. Discussion has moved over to the thread Dave started. Cheers.