[Style] Explicit self when referencing instance member

I've stated my case about closures in 3.2

I actually didn’t realize that putting an implicitly strong self in the capture list would also suppress the warning! I’d be surprised if this is widely known since I don’t believe I’ve ever seen it in Swift code in the wild. (Indeed, capture lists in general aside from weak/unowned self tend to be uncommon.)

If you want that pattern to take off, I think you’d want more than just the linter involved here; the compiler diagnostics heavily encourage users to write explicit self. member references instead.

That's why it is marked with Pitch – to change today's views and update compiler implementation.

Ah, that’s why I was confused—you may want to move this to Evolution > Pitches. As it is right now, it’s labeled “[Style]” and in the Using Swift forum so it looks like you’re merely debating style issues.

This is to debate style issues. I haven't seen strong argument for explicit self yet. I'm uncovering compiler issues along the way.

Repetition forms habit, my nana used to say.

Whether it is writing code or reading it, repetition taps into the procedural memory mechanism of your brain becoming unconscious. Sooner or later, you'll become unaware that "self" is even there. This can be a great thing or the exact opposite of what you want.

There's something very cool about procedural memory: If instead of mechanical, the repetition is fueled by willingness, reasoning and fun, it forms intuition.

The goal, then, is to make complex semantics simple enough that a programmer will be willing (eager even) to reason about complex problems and develop intuition, instead of doing whatever the compiler tells you to do to make your thing work, and end up developing a mechanical habit.

A compiler who has to remind intermediate/veteran Swift programmers of a missing self. or a missing @escaping, or whatever, is a problem in the language. I see this happen all the time. To coworkers, to friends and to my face. And the response to it never is "Phew!", it is "Ugh." (sometimes followed by a random expletive.)

• The optionality of self is what the problem is.

It introduces semantic ambiguity between a complex subject you don't face often (memory management in ARC), and a very simple and familiar one ("reaching a member of an object").

As much as I love Swift, this is kind of a major "wat" to me.

Especially if you're a newcomer trying something inane like updating a label inside an async closure: "Implicit use of self, use self to make closure capture semantics expli" huh? what? wait, there's a fix it button. Whatever, fix it.

This almost guarantees users won't even understand why they should reason about what this is about. It's a nuisance. Even worse, some could end up forming a habit of ignoring the problem till they can feel it's fangs in their butt cheeks.

Two (hopefully) solutions are:

  1. Do the exact opposite: Disallow the usage of self. everywhere it's optional except when it is obligatory. This would make the semantics around it a bit more explicit, without making it repetitive enough as to end up becoming ignored noise.

  2. Explicit ownership semantics ala Rust. Which is the current direction in which Swift is headed, as far as I can remember from https://github.com/apple/swift/blob/master/docs/OwnershipManifesto.md. And the best answer we have to this problem so far, in my arrogant opinion.

Btw, code that is easier to reason and code with bad noise to signal ratio is a dichotomy.

1 Like

I don't think we should increase the amount of required explicit self. I would be in favor of allowing [self] capture lists to remove the need to use explicit self within a closure.

I do like a lot of the possible IDE feature ideas though, a highlight flash of all the variables whose references were just changed by a paste, comment, or deletion would be great.

I also think a special case compiler warning that catches assigning self-capturing closures or functions to ivars might be helpful, though it might be good to first search through a bunch of existing Swift code to find cases where someone uses it but it's actually okay (gets reassigned afterwards) to see if that's something commonly done (I would think not but who knows).

I also agree that more explicit ownership would be one of the best solutions to this, one of my biggest annoyances with the explicit self rules is if you try to do something like this and have to use explicit self because the compiler doesn't notice that your closures don't live beyond the line:

let thing = Array(collection.lazy.filter{ $0 < self.a }.map{ $0 * self.b })
2 Likes

Yes, please. Explicit self. is in my opinion very incredibly important because otherwise I find it nearly impossible to code-review. I understand we can’t change the language to require it but it should definitely be at least recommended. How else can I know from a GitHub diff is something’s a property access or not?

The “self. becomes noise” argument I really can’t follow. I use self. as a search term in Xcode and I explicitly look for self. in code reviews. In SwiftNIO explicit self. is required as a coding standard (not yet enforced sadly) so I can be reasonably sure I don’t miss anything but I’d like to 100% sure.

The pitch describes the reasons for explicit self. very well. I want to add that local reasoning is very important and having explicit self. makes that possible because it’s a signal that we can no longer reason locally, everything that uses self. affects more than just the local function. Also: in complex library/framework code it’s often important to know when state is reconciled because if there’s any potential for re-entrancy, state needs to be reconciled before calling out to the user (as the user might re-enter the same method). I think a very high number of SwiftNIO bugs happened because we called into user code before reconciling state on some object. And state modification on an object is much easier spotted when self. is explicit. All assignments that aren’t to self.xyz are mostly local variables and hence much less dangerous.

Today my strategy often is to make a given part of a method temporarily a static func which requires me to list all the “inputs” explicitly as function parameters, all that just to be sure nothing uses self. without me noticing...

6 Likes

@matux's post is a good argument as to why it's noise: if it's always there then the user reading the code has no idea when it's actually important. Most of the time with local state like that it doesn't really matter where the value comes from, since you really only have two possibilities: local, which should be visible; and instance, and it should clear from context that it's not local. However, in some cases I guess, like SwiftNIO, it might be important given the fragility of local state. Which is why it should probably be a configurable part of a built in formatter.

In general explicit self seems a lot like explicit strong in Obj-C properties: as the default, it can serve to illustrate that "this is the default" but really is just noise since nearly everything uses the default value.

7 Likes

If you are reviewing logic, look at the whole context, review full function body, not only the changes introduced.

On the project as fundamental as NIO, I'd expect TDD style, and more focus on tests.

1 Like

A simple type, without explicit self:

struct Vector {

  var x: Double
  var y: Double

  var length: Double {
    get {
      return sqrt(x * x + y * y)
    }
    set {
      x = (x / length) * newValue
      y = (y / length) * newValue
    }
  }
}

With explicit self:

struct Vector {

  var x: Double
  var y: Double

  var length: Double {
    get {
      return sqrt(self.x * self.x + self.y * self.y)
    }
    set {
      self.x = (self.x / self.length) * newValue
      self.y = (self.y / self.length) * newValue
    }
  }
}

For me, the second version is unreadable :( Perhaps it's the repetition, or that self is a reserved word and therefor highlighted, but the only information that reaches my brain after scanning that code is "self self self self self ...". Reading the second version for me is like trying to talk to someone while someone else is yelling in your face.

I assume I'm not a special snowflake and that many others have the same issue.

I really hope this discussion won't be reopened and that the rationale for rejecting SE-9 applies to the style guide as well.

17 Likes

To me self. used when not required is definitely noise. Code readability has become a pet peeve of mine in recent years. I don't like long methods. I don't like a bunch of boiler plate code in a large method that should be tucked away in short one-purpose methods I don't like unclear variable names and I don't like reading anything that isn't required to understand the code. I place unnecessary self. in the category of something extra to read that provides me no additional semantic meaning.

The usual response to using self. to mark ivar usage is that if you can't tell if a variable is local or an ivar then your code has bigger problems.

The one case I'm on the fence about a little is marking usage of inherited ivars. If you press me I'll say don't do it but I kind of understand why some people do it.

2 Likes

In case others missed it, GitHub lets you expand context in diffs:

But to a reviewer on GitHub without the explicit self. it's unclear. Also as I pointed out: You can search for self. but you can't if omitted.

But in multi-threaded and re-entrant code, instance properties might change whilst you're in some local scope, so personally everything that's an instance variable always reads like 'danger here' to me but only if I can actually see it (self.).

Maybe it's because I don't work on UI heavy code bases, maybe self. is much more common in every other line there?

There's a lot of expanding to do there but most importantly it's so easy to miss an instance variable access lacking the self..

1 Like

That's exactly what we do, I would consider NIO very well unit-tested but make yourself an opinion. We consider unit tests to be incredibly important and we won't merge without decent tests. However tests and TDD only gets you so far. Even 100% test coverage means very little in reality as with concurrent and re-entrant code lots of state can change in lots of places, very often without you expecting that to happen.

For example take this silly piece of code (totally made up):

class FileOpener {
    var fileDescriptor = Int?
    weak var delegate: FileOpenerDelegate? = nil

    func openFile() throws {
        guard self.fileDescriptor == nil else {
            return
        }
        if let fd = open(....) // actually open the file {
            self.delegate?.fileOpener(self, didOpenFile: ...)
            self.fileDescriptor = fd
        }
    }

    func closeFile() throws { ... }
}

looks pretty harmless, getting 100% test coverage trivial but you'd still likely miss an important but subtle bug: self.fileDescriptor = fd is done too late (after calling out). If the self.delegate?.fileOpener(..., didOpenFile: ...) happens to somehow call openFile on the same object again, you're in trouble because self.fileDescriptor is still nil (but it shouldn't) and we'd open the file again, leaking a file descriptor.

The way I'd review code like this is to find the callouts (search for self.delegate.) and look if there's any state possibly modified after a callout has been done (search for self\..* = in the lines following self.delegate.). If there's any such modification after the callout, then very likely we have a bug here. Quite straightforward in a review with self., very hard to do without...

Testing all the various situations is usually impossible because there's infinitely many things that can be done during a callout so you can't just test for everything that might happen.


Interestingly, the 'yelling in the face' is exactly what I want. In your particular example: It's a value type that doesn't call out, the 'yelling in the face' in indeed just noise as the properties can't change 'under the hood' because mutating functions in structs own the struct exclusively and if you violate it, Swift will crash the program for you.

For everything that's a reference type (class) though or code that calls out to the user, or maybe passed self as an inout, then I believe you really need the 'yelling in the face' to call out a property access as what it is: a side-effect which might get you different data from last time you used that variable, often the source of TOCTOU security vulnerabilities.

3 Likes

These are really interesting code review stories. I see and understand how "mechanical" checks, purely based on syntax, can help. I pretty much like syntax-based safety myself, because it avoids needless thinking. Thanks for taking the time to share your techniques.

Now, may I suggest a solution that would please everybody? Not really in the context of this thread, because it talks about "recommended practices" which are highly subjective, as we can see, and quite unlikely to settle on a one-size-fits-all conclusion, but in the context of the current Standard Style Guide and Formatter thread:

When reformatting, preserve existing self.

This would allow your team's practices to keep on flying.

This would also leave all those pieces of code intact:

init(area: Int) {
    // Can't avoid `self.`
    self.area = area
}

init(width: Int, height: Int) {
    // My team doesn't want `self.`
    area = width * height
}

init(width: Int, height: Int) {
    // My team wants `self.`
    self.area = width * height
}
1 Like

:+1:, probably the only viable solution. There can also be more than one tool: A code formatter without configuration (like go-fmt) that does indentation, spacing, etc and a code linter that has configuration and checks things like explicit self..

I did stop on the code before reading your comment, and pointed the issue right away. But the line I pointed to is the open() call. Is the open() a call-out in your terms? Would you miss it review?

You are not modeling infinitely many things during review either. If the function needs to be reentrant, then are you missing tests for reentrancy?

No, the self.delegate?..... is.

In this particular (made up) example a re-entrancy test is indeed very simple and should be present but in more complex scenarios it's often a huge amount of work (and imagination) to come up with the very scenario that causes issues.
You're right, that review can't model all things either but if I'm confident there's no state mutation after a callout, I'm much less worried about re-entrancy and can move on, I don't need to think about it anymore. If there are local state accesses however I will need to consider re-entrancy. Therefore, for me, it's important to see all local state accesses and I do that by searching for self.. I just haven't found another good enough approximation for code bases without explicit self. yet. That's all :slight_smile: