[Style] Explicit self when referencing instance member

paiv · March 4, 2019, 12:46am

This was discussed previously either to always require self SE-0009 (rejected), or have an optional compiler warning (rejected).

Let's discuss whether and when this should be a recommended practice.

Required vs Optional

There is only one case where you are required to write explicit self – to refer to a member shadowed by another variable.

Other cases are optional (for self in closures see below), and are open to discussion whether it should be a recommended practice.

Criteria

To decide whether or not to use explicit self when it is optional.

The arguments are:

it reduces errors
it improves code readability

The errors reported from not using explicit self boil down to:

confused shadowing
unexpected strong reference to self

1. (Required) Instance member is shadowed

Common shadowing scenarios. Explicit `self` is required.

1.a Argument to initializer

class Style1a {
    init(bar: Int) {
        self.bar = bar
    }
    
    let bar: Int
}

1.b Local variable

class Style1b {
    var bar: Int = 0

    func foo() {
        let bar = 42
        self.bar = bar
    }
}

2. Confused shadowing

2.1 Readability of shadowed code

class Style21a {
    func foo() {
        if var bar = bar {
            bar = bar + 1
        }
        bar = 42
    }
    
    var bar: Int?
}

class Style21b {
    func foo() {
        if let bar = self.bar {
            self.bar = bar + 1
        }
        self.bar = 42
    }
    
    var bar: Int?
}

Pros of explicit self:

Easier to reason, you can always tell when it refers to instance member

Cons:

When reading, self. adds noise

Alternatives:

Review the code in full, follow logic flow
IDE Review with "highlight all appearances of the selected symbol in the enclosing code scope"
IDE Rely on syntax highlighting difference for instance and non-instance variables

2.2 Adding shadowing

I just copy-pasted a working code block

Starting with this code:

class Style22 {
    func foo() {
        bar += 1
    }
    
    var bar: Int = 0
}

And then at some point introducing shadowing on line 3:

class Style22 {
    func foo() {
        var bar = 42
        bar += 1
    }
    
    var bar: Int = 0
}

This happens when pasting code, or just by adding local variable.

Pros of explicit self:

There will be no shadowing
Compiler might warn you of unused local variable, or assigning to read-only variable, but this depends on code pasted
When reading, you can always tell it refers to instance member

Cons:

You might never add shadowing, self is unneeded
When reading, self. adds noise

Alternatives:

Cover the code with tests to ensure proper behavior
Review the code in full, follow logic flow
IDE Review with "highlight all appearances of the selected symbol in the enclosing code scope"
IDE Rely on syntax highlighting difference for instance and non-instance variables
IDE could warn you that pasted code introduces shadowing

2.3 Removing shadowing

I thought it was a local variable

Starting with this code:

class Style23 {
    func foo() {
        var bar = 42
        bar += 1
    }
    
    var bar: Int = 0
}

And then at some point removing line 3:

class Style23 {
    func foo() {
        bar += 1
    }
    
    var bar: Int = 0
}

In a more complex code you might believe you still have a local variable.

Pros of explicit self:

If you miss explicit self, linter will warn you, forcing a review

Cons:

You have to update more code with self, if this was intentional

Alternatives:

Cover the code with tests to ensure proper behavior
Review the code in full, follow logic flow
IDE Review with "highlight all appearances of the selected symbol in the enclosing code scope"
IDE Rely on syntax highlighting difference for instance and non-instance variables
IDE could warn you of removed shadowing

3. Unexpected strong reference to self

Note that confused shadowing could lead to unexpected strong reference, but we need to reason about them separately.

Exercise 1

Can you tell the line that introduces a reference cycle?

let s = Style30()
s.some = s.bar
s.some = s.nada
s.some = s.niente

What if I move this code inside a method? (bar, nada, and niente are still members of this class)

class Style30 {
    var some: Any?
    
    func foo() {
        some = bar
        some = nada
        some = niente
    }

And if I add self?

class Style30 {
    var some: Any?

    func foo() {
        self.some = self.bar
        self.some = self.nada
        self.some = self.niente
    }

Does self reveal a reference cycle?

Here is the rest of the class:

class Style30 {
    var some: Any?

    func bar() {}
    let nada: Int = 42
    let niente = Inner()
    
    class Inner {}
}

Writing self does not reveal captured reference. To reason about references, you need to see the types involved, and have an implicit knowledge that method reference captures self. Nothing conveys this in language syntax – this is an unfortunate design deficiency.

Exercise 2

Can you tell the line that introduces a reference cycle?

class Style30b {
    var some: Any?
    
    func foo() {
        self.some = {
            self.bar()
            self.nada
            self.niente
            self.niente.x
        }
    }
    
    func bar() {}
    let nada: Int = 42
    let niente = Inner()
    
    class Inner { let x = 11 }
}

Any reference to instance member inside a closure needs that instance captured. So all the lines 6-9 need self captured in that closure, which introduces reference cycle when that closure stored back on the instance.

3.1 Method reference

class Style31a {
    init() {
        some = bar
    }
    var some: Any?
    func bar() {}
}

class Style31b {
    init() {
        self.some = self.bar
    }
    var some: Any?
    func bar() {}
}

Line 3 introduces reference cycle.

Pros of explicit self:

If you miss self, linter will warn you, forcing a review

Cons:

Explicit self doesn't reveal captured self
When reading, self. adds noise
More typing

Alternatives:

Review the code in full, with the types involved

The compiler might be patched SR-8536, requiring you to write self in this case. But if you write self everywhere, how does that help?

3.2 Closure

class Style32 {
    init() {
        some = {
            bar()
        }
    }
    var some: Any?
    func bar() {}
}

Call to method 'bar' in closure requires explicit 'self.' to make capture semantics explicit

class Style32 {
    init() {
        self.some = {
            self.bar()
        }
    }
    var some: Any?
    func bar() {}
}

Since any instance reference will need self captured, compiler now requires explicit self in this case.

Pros of explicit self:

If you miss self, ~~linter~~ compiler emits error, forcing a review

Cons:

When reading, self. adds noise
More typing

Pitch: I believe, forcing explicit self in this case was a confused decision. You should be able to tell instance members by other means, see shadowing above. And to convey explicit capture of self there is a cleaner syntax – capture list:

class Style32c {
    init() {
        some = { [self] in
            bar()
        }
    }
    var some: Any?
    func bar() {}
}

Forcing explicit self in closures on every member access introduced confusion: if it is required in closures, should I do it everywhere else? Why is it not required everywhere else?

Instead compiler should require explicit self in capture list, and let linter check the rest.

3.3 Auto-Closure

class Style33a {
    let bar: Int = 42
    
    func foo() {
        nada(bar)
        zilch(bar)
    }

    func nada(_ cb: @autoclosure () -> Int) {}
    func zilch(_ cb: @escaping @autoclosure () -> Int) {}
}

Reference to property 'bar' in closure requires explicit 'self.' to make capture semantics explicit

class Style33b {
    let bar: Int = 42
    
    func foo() {
        self.nada(self.bar)
        self.zilch(self.bar)
    }

    func nada(_ cb: @autoclosure () -> Int) {}
    func zilch(_ cb: @escaping @autoclosure () -> Int) {}
}

Pros of explicit self:

If you miss self, linter warns you, forcing a review

Cons:

Explicit self doesn't reveal captured self
When reading, self. adds noise
More typing

Pitch: Again, adding explicit self doesn't "make capture semantics explicit". It was a confused decision adding it to the compiler. To make it explicit, we need a different syntax, like:

func foo() {
    nada(bar)
    zilch([self] bar)
}

3.4 Nested function

class Style34a {
    var bar: Int = 0
    
    func foo() {
        func niente() {
            bar = 42
        }
        
        nada(niente)
        zilch(niente)
    }

    func nada(_ cb: () -> Void) {}
    func zilch(_ cb: @escaping () -> Void) {}
}

No compiler warnings, but niente captures self.

class Style34b {
    var bar: Int = 0
    
    func foo() {
        func niente() {
            self.bar = 42
        }
        
        self.nada(niente)
        self.zilch(niente)
    }

    func nada(_ cb: () -> Void) {}
    func zilch(_ cb: @escaping () -> Void) {}
}

Pros of explicit self:

If you miss self, linter warns you, forcing a review

Cons:

Explicit self doesn't reveal captured self
When reading, self. adds noise
More typing

Alternatives:

Replace nested function with a closure

Pitch: Compiler should warn you of captured self, and suggest using a closure with capture list. Or have an extended syntax for nested functions with a capture list, like:

func foo() {
    func niente() { [self] in
        bar = 42
    }
}

Section notes

A compiler/linter warning for a missed self is not to be blindly "fixed" by adding self, it is an opportunity to review that line. So if you type explicit self everywhere, there will be less compiler/linter warnings, triggering less reviews. Explicit self does not mean you have reviewed that line for captured self.

4. Improve readability

4.1 Explicit self always

class Style41a {
    var bar: Int = 2
    var nada: Int = 3

    var baz: Int {
        return bar + nada
    }
    
    func foo() {
        bar = baz + nada
    }
}

class Style41b {
    var bar: Int = 2
    var nada: Int = 3

    var baz: Int {
        return self.bar + self.nada
    }
    
    func foo() {
        self.bar = self.baz + self.nada
    }
}

Pros of explicit self:

Easier to reason, you can always tell when it refers to instance member

Cons:

When reading, self. adds noise
More typing

4.2 Always self in initializers

Given the common initializer scenario, prefer explicit self on every instance reference in initializer.

class Style42a {
    init(bar: Int) {
        self.bar = bar
        nada = 42
    }
    
    let bar: Int
    let nada: Int
}

class Style42b {
    init(bar: Int) {
        self.bar = bar
        self.nada = 42
    }
    
    let bar: Int
    let nada: Int
}

Pros of explicit self:

Easier to reason, you can always tell when it refers to instance member
Uniform style easier to read

Cons:

When reading, self. adds noise
More typing

IDE Support

Highlight selected symbol

This is very useful, but relies on lexical scope, thus beyond simple text editors.

selected symbol|

Syntax highlighting

Have different style for instance and non-instance variables. This relies on lexical scope.

syntax highlight

Notify of added/removed shadowing

Pitch: IDE could notify you with a confirmation dialog, or a less intrusive inline warning, or a short-lived highlight.

Discussions

allevato · March 4, 2019, 1:41am

There's a second case: when referring to a property or method of self inside an escaping closure.

paiv · March 4, 2019, 1:58am

I've stated my case about closures in 3.2

allevato · March 4, 2019, 2:04am

I actually didn’t realize that putting an implicitly strong self in the capture list would also suppress the warning! I’d be surprised if this is widely known since I don’t believe I’ve ever seen it in Swift code in the wild. (Indeed, capture lists in general aside from weak/unowned self tend to be uncommon.)

If you want that pattern to take off, I think you’d want more than just the linter involved here; the compiler diagnostics heavily encourage users to write explicit self. member references instead.

paiv · March 4, 2019, 2:08am

That's why it is marked with Pitch – to change today's views and update compiler implementation.

allevato · March 4, 2019, 2:11am

Ah, that’s why I was confused—you may want to move this to Evolution > Pitches. As it is right now, it’s labeled “[Style]” and in the Using Swift forum so it looks like you’re merely debating style issues.

paiv · March 4, 2019, 2:16am

This is to debate style issues. I haven't seen strong argument for explicit self yet. I'm uncovering compiler issues along the way.

matux · March 4, 2019, 4:07am

Repetition forms habit, my nana used to say.

Whether it is writing code or reading it, repetition taps into the procedural memory mechanism of your brain becoming unconscious. Sooner or later, you'll become unaware that "self" is even there. This can be a great thing or the exact opposite of what you want.

There's something very cool about procedural memory: If instead of mechanical, the repetition is fueled by willingness, reasoning and fun, it forms intuition.

The goal, then, is to make complex semantics simple enough that a programmer will be willing (eager even) to reason about complex problems and develop intuition, instead of doing whatever the compiler tells you to do to make your thing work, and end up developing a mechanical habit.

A compiler who has to remind intermediate/veteran Swift programmers of a missing self. or a missing @escaping, or whatever, is a problem in the language. I see this happen all the time. To coworkers, to friends and to my face. And the response to it never is "Phew!", it is "Ugh." (sometimes followed by a random expletive.)

• The optionality of self is what the problem is.

It introduces semantic ambiguity between a complex subject you don't face often (memory management in ARC), and a very simple and familiar one ("reaching a member of an object").

As much as I love Swift, this is kind of a major "wat" to me.

Especially if you're a newcomer trying something inane like updating a label inside an async closure: "Implicit use of self, use self to make closure capture semantics expli" huh? what? wait, there's a fix it button. Whatever, fix it.

This almost guarantees users won't even understand why they should reason about what this is about. It's a nuisance. Even worse, some could end up forming a habit of ignoring the problem till they can feel it's fangs in their butt cheeks.

Two (hopefully) solutions are:

Do the exact opposite: Disallow the usage of self. everywhere it's optional except when it is obligatory. This would make the semantics around it a bit more explicit, without making it repetitive enough as to end up becoming ignored noise.
Explicit ownership semantics ala Rust. Which is the current direction in which Swift is headed, as far as I can remember from https://github.com/apple/swift/blob/master/docs/OwnershipManifesto.md. And the best answer we have to this problem so far, in my arrogant opinion.

Btw, code that is easier to reason and code with bad noise to signal ratio is a dichotomy.

TellowKrinkle · March 4, 2019, 7:30am

I don't think we should increase the amount of required explicit self. I would be in favor of allowing [self] capture lists to remove the need to use explicit self within a closure.

I do like a lot of the possible IDE feature ideas though, a highlight flash of all the variables whose references were just changed by a paste, comment, or deletion would be great.

I also think a special case compiler warning that catches assigning self-capturing closures or functions to ivars might be helpful, though it might be good to first search through a bunch of existing Swift code to find cases where someone uses it but it's actually okay (gets reassigned afterwards) to see if that's something commonly done (I would think not but who knows).

I also agree that more explicit ownership would be one of the best solutions to this, one of my biggest annoyances with the explicit self rules is if you try to do something like this and have to use explicit self because the compiler doesn't notice that your closures don't live beyond the line:

let thing = Array(collection.lazy.filter{ $0 < self.a }.map{ $0 * self.b })

johannesweiss · March 4, 2019, 9:08pm

Yes, please. Explicit self. is in my opinion very incredibly important because otherwise I find it nearly impossible to code-review. I understand we can’t change the language to require it but it should definitely be at least recommended. How else can I know from a GitHub diff is something’s a property access or not?

The “self. becomes noise” argument I really can’t follow. I use self. as a search term in Xcode and I explicitly look for self. in code reviews. In SwiftNIO explicit self. is required as a coding standard (not yet enforced sadly) so I can be reasonably sure I don’t miss anything but I’d like to 100% sure.

The pitch describes the reasons for explicit self. very well. I want to add that local reasoning is very important and having explicit self. makes that possible because it’s a signal that we can no longer reason locally, everything that uses self. affects more than just the local function. Also: in complex library/framework code it’s often important to know when state is reconciled because if there’s any potential for re-entrancy, state needs to be reconciled before calling out to the user (as the user might re-enter the same method). I think a very high number of SwiftNIO bugs happened because we called into user code before reconciling state on some object. And state modification on an object is much easier spotted when self. is explicit. All assignments that aren’t to self.xyz are mostly local variables and hence much less dangerous.

Today my strategy often is to make a given part of a method temporarily a static func which requires me to list all the “inputs” explicitly as function parameters, all that just to be sure nothing uses self. without me noticing...

Jon_Shier · March 4, 2019, 9:59pm

@matux's post is a good argument as to why it's noise: if it's always there then the user reading the code has no idea when it's actually important. Most of the time with local state like that it doesn't really matter where the value comes from, since you really only have two possibilities: local, which should be visible; and instance, and it should clear from context that it's not local. However, in some cases I guess, like SwiftNIO, it might be important given the fragility of local state. Which is why it should probably be a configurable part of a built in formatter.

In general explicit self seems a lot like explicit strong in Obj-C properties: as the default, it can serve to illustrate that "this is the default" but really is just noise since nearly everything uses the default value.

paiv · March 4, 2019, 10:24pm

If you are reviewing logic, look at the whole context, review full function body, not only the changes introduced.

On the project as fundamental as NIO, I'd expect TDD style, and more focus on tests.

svanimpe · March 4, 2019, 10:24pm

A simple type, without explicit self:

struct Vector {

  var x: Double
  var y: Double

  var length: Double {
    get {
      return sqrt(x * x + y * y)
    }
    set {
      x = (x / length) * newValue
      y = (y / length) * newValue
    }
  }
}

With explicit self:

struct Vector {

  var x: Double
  var y: Double

  var length: Double {
    get {
      return sqrt(self.x * self.x + self.y * self.y)
    }
    set {
      self.x = (self.x / self.length) * newValue
      self.y = (self.y / self.length) * newValue
    }
  }
}

For me, the second version is unreadable :( Perhaps it's the repetition, or that self is a reserved word and therefor highlighted, but the only information that reaches my brain after scanning that code is "self self self self self ...". Reading the second version for me is like trying to talk to someone while someone else is yelling in your face.

I assume I'm not a special snowflake and that many others have the same issue.

I really hope this discussion won't be reopened and that the rationale for rejecting SE-9 applies to the style guide as well.

phoneyDev · March 5, 2019, 1:24am

To me self. used when not required is definitely noise. Code readability has become a pet peeve of mine in recent years. I don't like long methods. I don't like a bunch of boiler plate code in a large method that should be tucked away in short one-purpose methods I don't like unclear variable names and I don't like reading anything that isn't required to understand the code. I place unnecessary self. in the category of something extra to read that provides me no additional semantic meaning.

The usual response to using self. to mark ivar usage is that if you can't tell if a variable is local or an ivar then your code has bigger problems.

The one case I'm on the fence about a little is marking usage of inherited ivars. If you press me I'll say don't do it but I kind of understand why some people do it.

paiv · March 5, 2019, 6:56am

In case others missed it, GitHub lets you expand context in diffs:

johannesweiss · March 5, 2019, 2:16pm

But to a reviewer on GitHub without the explicit self. it's unclear. Also as I pointed out: You can search for self. but you can't if omitted.

But in multi-threaded and re-entrant code, instance properties might change whilst you're in some local scope, so personally everything that's an instance variable always reads like 'danger here' to me but only if I can actually see it (self.).

Maybe it's because I don't work on UI heavy code bases, maybe self. is much more common in every other line there?

johannesweiss · March 5, 2019, 2:17pm

There's a lot of expanding to do there but most importantly it's so easy to miss an instance variable access lacking the self..

johannesweiss · March 5, 2019, 2:45pm

That's exactly what we do, I would consider NIO very well unit-tested but make yourself an opinion. We consider unit tests to be incredibly important and we won't merge without decent tests. However tests and TDD only gets you so far. Even 100% test coverage means very little in reality as with concurrent and re-entrant code lots of state can change in lots of places, very often without you expecting that to happen.

For example take this silly piece of code (totally made up):

class FileOpener {
    var fileDescriptor = Int?
    weak var delegate: FileOpenerDelegate? = nil

    func openFile() throws {
        guard self.fileDescriptor == nil else {
            return
        }
        if let fd = open(....) // actually open the file {
            self.delegate?.fileOpener(self, didOpenFile: ...)
            self.fileDescriptor = fd
        }
    }

    func closeFile() throws { ... }
}

looks pretty harmless, getting 100% test coverage trivial but you'd still likely miss an important but subtle bug: self.fileDescriptor = fd is done too late (after calling out). If the self.delegate?.fileOpener(..., didOpenFile: ...) happens to somehow call openFile on the same object again, you're in trouble because self.fileDescriptor is still nil (but it shouldn't) and we'd open the file again, leaking a file descriptor.

The way I'd review code like this is to find the callouts (search for self.delegate.) and look if there's any state possibly modified after a callout has been done (search for self\..* = in the lines following self.delegate.). If there's any such modification after the callout, then very likely we have a bug here. Quite straightforward in a review with self., very hard to do without...

Testing all the various situations is usually impossible because there's infinitely many things that can be done during a callout so you can't just test for everything that might happen.

Interestingly, the 'yelling in the face' is exactly what I want. In your particular example: It's a value type that doesn't call out, the 'yelling in the face' in indeed just noise as the properties can't change 'under the hood' because mutating functions in structs own the struct exclusively and if you violate it, Swift will crash the program for you.

For everything that's a reference type (class) though or code that calls out to the user, or maybe passed self as an inout, then I believe you really need the 'yelling in the face' to call out a property access as what it is: a side-effect which might get you different data from last time you used that variable, often the source of TOCTOU security vulnerabilities.

gwendal.roue · March 5, 2019, 2:59pm

These are really interesting code review stories. I see and understand how "mechanical" checks, purely based on syntax, can help. I pretty much like syntax-based safety myself, because it avoids needless thinking. Thanks for taking the time to share your techniques.

Now, may I suggest a solution that would please everybody? Not really in the context of this thread, because it talks about "recommended practices" which are highly subjective, as we can see, and quite unlikely to settle on a one-size-fits-all conclusion, but in the context of the current Standard Style Guide and Formatter thread:

When reformatting, preserve existing self.

This would allow your team's practices to keep on flying.

This would also leave all those pieces of code intact:

init(area: Int) {
    // Can't avoid `self.`
    self.area = area
}

init(width: Int, height: Int) {
    // My team doesn't want `self.`
    area = width * height
}

init(width: Int, height: Int) {
    // My team wants `self.`
    self.area = width * height
}

johannesweiss · March 5, 2019, 3:01pm

, probably the only viable solution. There can also be more than one tool: A code formatter without configuration (like go-fmt) that does indentation, spacing, etc and a code linter that has configuration and checks things like explicit self..