Allow stored properties in extensions

I don't think there's a formal correctness issue with modifying the rules about what changes are ABI compatible, but we'd want to be very careful about any such changes to avoid 'pulling the rug' out from under anyone who has been relying on certain transformations being ABI-compatible. IMO it would be best to keep the policies the same and just disable this functionality for @frozen structs.

IMO, the sorts of properties you describe should absolutely not be part of the synthesized conformance to the protocols mentioned. If users want those properties included they should be required to implement Equatable et al manually, in which case the problem of determining the behavior of the conformance is solved manually because it is written out explicitly.

This appears to be essentially the same rationale by which synthesized Equatable and Hashable conformance was restricted to conformance declarations on the main type. That restriction has, of course, since been relaxed to include extensions in the same file.

As you know, the same history is applicable to the private scope, which was eventually expanded so that members are accessible in all same-file extensions.

Over and over again, we are rediscovering the file scope as the meaningful sub-modular unit by which to divide code, even as many first attempts try to eschew it. I think it is fair to say that it has stood the test of time.

Given that it is possible to declare synthesized Equatable and Hashable conformances in same-file extensions, this promise as understood in that manner was broken from the get-go, was it not?

Sorry, I see your meaning, and that is a good point about extension members being reordered in the same file. This could be addressed by always laying out members declared in extensions in alphabetical rather than source order, rather than banning this for resilient types outright, methinks?

This is compelling, but I'm not sure I quite agree that these two examples are analogous with the case at hand. In the case of Equatable/Hashable there were compelling functional reasons to allow the conformance synthesis with same-file extensions:

Synthesis is supported in same-file extensions to ensure that generic types can synthesize a conditional conformance, since the properties may only satisfy the requirements for synthesis (see below) with extra bounds:

struct Bad<T>: Equatable { // synthesis not possible, T is not Equatable
    var x: T
}

struct Good<T> {
   var x: T
}
extension Good: Equatable where T: Equatable {} // synthesis works, T is Equatable
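
As a quick check of the quoted rule, here is a minimal sketch that reuses `Good` from the snippet above. The `Opaque` type is a made-up stand-in for any type that is not Equatable.

```swift
struct Good<T> {
    var x: T
}

// Conditional conformance declared in a same-file extension; the compiler
// synthesizes == because x is Equatable whenever T is.
extension Good: Equatable where T: Equatable {}

// Hypothetical type that is deliberately not Equatable.
struct Opaque {}

let same = Good(x: 1) == Good(x: 1)
let different = Good(x: "a") == Good(x: "b")

// Good<Opaque> is still a valid type; it just does not get ==.
let opaque = Good(x: Opaque())

print(same, different)  // true false
```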

And as far as same-file-extension-private-access goes, I don't view the downsides of that decision as being quite as problematic as they are here. Answering the question "what are all the stored properties of this type?" becomes nontrivially harder under this proposal, but answering the question "where does this particular private member get used?" amounts to a 'find' operation within the file with or without same-file-extension access.

I'm mostly focusing on the "re-ordered" part of that sentence, which AFAICT is not violated by synthesized Equatable/Hashable conformances on same-file extensions. Point taken that the "moved between source files" is perhaps not strictly true thanks to such conformances, but at least in that situation the failure mode is a compilation failure and not an ABI break.

EDIT: oop, wires crossed!

Yes, I think this would work, though I'd probably err on the side of disallowing it at first and seeing if we really feel like we need it for @frozen structs. Also, with Unicode-supported source files, we'd have to make sure that we define 'alphabetical' in a way that would not change based on the Unicode version that the source file is compiled against. :slightly_smiling_face: I have a vague memory that Swift already has some odd behavior around Unicode-equivalent identifiers in source?
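
To make the Unicode concern concrete, here is a sketch of one candidate definition of 'alphabetical': order names by their raw UTF-8 bytes, which depends only on the bytes in the source file and not on any Unicode tables. This is an illustrative assumption, not how the compiler lays out members today, and `layoutOrder` is a made-up helper.

```swift
// Hypothetical helper: sorts property names by their raw UTF-8 bytes.
func layoutOrder(_ names: [String]) -> [String] {
    names.sorted { $0.utf8.lexicographicallyPrecedes($1.utf8) }
}

let precomposed = "\u{E9}"    // "é" as a single scalar
let decomposed = "e\u{301}"   // "e" followed by a combining acute accent

// Swift's String == uses canonical equivalence, so these compare equal...
print(precomposed == decomposed)  // true

// ...but their bytes differ, so a byte-wise order tells them apart.
let ordered = layoutOrder(["zeta", precomposed, "alpha", decomposed])
print(ordered.map { Array($0.utf8) })
```

Whether canonically equivalent identifiers should count as the same name is a separate question that this sketch does not answer.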

Agreed, that makes sense to me.

Totally agree, this is a compelling reason to keep this functionality limited to a single file and not the entire module.

I don't think this is a good idea, for a few reasons:

Primarily, I feel like it trades a win for a loss in a way that is at best a wash, and probably a net loss. While it is nice to declare variables locally to their use, it's also nice to go to a type and answer, in one place, "what is this thing". That can be done if you declare all the storage in one place.

My preference for this comes from working on the standard library, back in the days when significant refactors were frequent. I would often find myself wanting to go to a type and ask "what is this thing made up of". But I couldn't: the type's storage declarations were scattered throughout the file in exactly the way proposed here, with stored properties declared just above where they were first used by some methods. Note "some" methods, because often that storage was then used much later down in the file too, after some other unrelated stuff. This worked because of another stylistic approach at the time, which was to declare (nearly) all the protocols a type conformed to at the top, instead of grouping them later with extensions.

I found this so confusing that I went through and reorganized most types in the standard library to do two things:

  1. Declare the minimum possible things (really just stored properties, some type aliases, and the most fundamental of inits) and then close the definition.
  2. Group everything else (protocol conformances, inner type definitions, or just other methods) into short extensions.

Having done this, the standard library code became far easier to work on, at least for me.
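
The two-step layout described above can be sketched as follows. `Stack` is a made-up type used purely for illustration; it is not taken from the standard library.

```swift
// Primary declaration: storage and the most fundamental init, nothing else.
struct Stack<Element> {
    var elements: [Element]

    init() { elements = [] }
}

// Everyday operations grouped in their own short extension.
extension Stack {
    mutating func push(_ element: Element) { elements.append(element) }
    mutating func pop() -> Element? { elements.popLast() }
}

// Each conformance gets its own extension.
extension Stack: CustomStringConvertible {
    var description: String { "Stack(\(elements))" }
}

var stack = Stack<Int>()
stack.push(1)
stack.push(2)
let top = stack.pop()
print(top as Any, stack)
```

Reading only the first declaration answers "what is this type made of": a single array.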

Now, I realize this proposal is in part motivated by #2 here. If you break conformances up into extensions, and just declare storage with the extension that needs it, then this makes those extensions more coherent.

But I think it would be a big loss, at least for the standard library code, to not be able to go to the struct definition and see "what is this type made out of – what is the essence of this thing". This is a really important question to answer, especially for low-level code. Is Array just a pointer to a buffer? I can tell you that easily because the declaration of Array is nice and minimal.[1] Same goes for Slice, or String or String.Index.

I admit this is a stylistic preference, but I'd really push to keep any codebase written in this style. I don't think we want to offer more choice in this area.

I might be biased by the fact that I don't tend to work on much business logic code, where perhaps storage required by conformances is more common than in the std lib. But I would also fear that on a sprawling codebase, making storage declaration easier would lead to bloated types where you don't even realize how big the types are getting. And it's also the case that often, storage might be primarily associated with some protocol conformance, but also touched in other places. And when that happens, you're back to the question "what is this type and what does it contain" and I wouldn't want that smeared all over the file.

Secondly (and probably much less importantly) I think this will cause confusion when people try to do this outside the file. A frequently requested feature is to be able to add storage to protocols, or to add storage in extensions even outside the type's module. This is often coming from folks used to a language where everything is a pointer, and they're looking for sugar for some kind of global lookup table based on the class reference (asking "how would adding storage to Int work?" usually helps). By allowing the syntax in a specific circumstance, you're likely going to upset people hoping for this feature even more.
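
For readers unfamiliar with the "global lookup table" idea, here is a rough sketch of what such sugar would stand for. `Node` and `labels` are made-up names, and this is an illustration under simplifying assumptions rather than a recommended pattern.

```swift
final class Node {}

// Hypothetical global side table keyed by object identity.
// Entries are never removed here, so a real version would need cleanup
// when an object is deallocated.
private var labels: [ObjectIdentifier: String] = [:]

extension Node {
    // Looks like storage, but is a computed property over the side table.
    var label: String? {
        get { labels[ObjectIdentifier(self)] }
        set { labels[ObjectIdentifier(self)] = newValue }
    }
}

let first = Node()
let second = Node()
first.label = "root"
print(first.label as Any, second.label as Any)
```

The table is keyed on object identity, which is exactly what a value type such as Int lacks, hence the "how would adding storage to Int work?" question.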


[1] OK, not as nice and minimal as you might wish if you work on it... but it used to be so much worse.

This is very well put indeed. I tend to do the same in my projects for big types: only vars and init in the main type body, everything else in extensions, which has the added bonus that those extensions could be moved to separate files (aside from the infamous problem of overexposing member visibility). To keep things semi-structured, I comment the relevant section appropriately:

struct Foo {
    var x: Int
    var y: Int

    // MARK: Foo.Protocol name, or an "informal" extension name, or a file name if extension is in a file
    var z: Int
}

Here z is perhaps only used in the relevant extension or file; specifying the extension or file name makes such dependencies easier to find.

There's also a potential collision course with the "strict conformances" feature we may have one day, a variation of which was pitched recently. In the strictest form it is "a conformance extension only has protocol conformance and nothing else".

It seems to me that the objections/concerns* apply most strongly to structs, and conversely that the points in favor are most relevant for classes. I'd suggest limiting the feature to classes, at least initially.


*Which I agreed with; I'm not particularly enthusiastic about this idea.

I think that extensions are not for decomposition, but for convenience methods and properties. Since splitting up state is clearly part of the decomposition process, I don't think it is reasonable to support the bad practice of fake, extension-based decomposition by adding language features for it.
More specifically, stored properties in extensions indicate low cohesion in the entity.
In the given example, AstronomicalObject should really be a superclass rather than a protocol, because it does not specify any common behavior, only the "contents" of the entity. The terraform method is already quite alarming here, because it is simply not applicable to the majority of astronomical objects (which, by the way, is a classic problem with inheritance).
So you should either declare this "fundamental" protocol in the declaration of the class (as is preferred with SwiftUI views, for example) or rethink your protocol, since there would clearly be dozens of other properties of astronomical objects and your protocol (or superclass) would grow endlessly.

I’m strongly against this. Being able to look at the main definition of a type and know that it contains all stored properties of that type is a feature, not a bug. And it isn’t just important for low-level programming, as @Ben_Cohen mentions; it’s about understanding what states the type can be in. That’s as important for business logic as it is for low-level programming.

I'm surprised to see so many reactions against this idea. I've frequently wished for the ability to add stored properties in extensions, as a way to incrementally make complex code simpler.

A lot of the arguments against this seem to center around the idea that it's important to be able to see all the states an object is in. That would make sense if the current limitation forced all stored properties to be declared together… but it does not. You can already write code that scatters property declarations and functions, so long as you don't use extensions. You can't assume that the list of vars at the top constitutes the full state of the type; you already have to search the rest of the type definition to make sure you haven't missed any. Tooling could make this easier by extracting all the variable definitions for you, but it could do that even if they were declared in extensions too.
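
A small sketch of both halves of that argument: storage that is already scattered within a single type body, and a mechanical way to list it. `Scattered` is a made-up type, and `Mirror` (runtime reflection) merely stands in for the editor tooling imagined above.

```swift
// Hypothetical type with storage interleaved between methods, which is
// already legal today without any extensions.
struct Scattered {
    var a = 1
    func first() -> Int { a }

    var b = "two"
    func second() -> String { b }

    var c = 3.0
}

// Mirror reflects a struct's stored properties, so the full list can be
// recovered mechanically no matter how the source is arranged.
let storedNames = Mirror(reflecting: Scattered()).children.compactMap { $0.label }
print(storedNames)  // ["a", "b", "c"]
```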

You can, of course, do what @Ben_Cohen suggests and close the type definition right after you've declared all the variables, and then implement all the functions in extensions. But that can have additional consequences; for example, functions declared in a class extension cannot override functions declared in a superclass (and cannot be overridden by a subclass), so if a function is an override (or is intended to be overridden), it must be in the primary type declaration. That means that you can't always keep your primary type declaration short and sweet, with all the variable declarations grouped together and all the functions elsewhere.
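
A minimal illustration of the override constraint, using made-up `Shape` and `Circle` classes:

```swift
class Shape {
    func describe() -> String { "shape" }
}

class Circle: Shape {
    var radius = 1.0

    // The override has to live in the primary declaration; moving it into
    // `extension Circle` is rejected by the compiler for a plain Swift class.
    override func describe() -> String { "circle of radius \(radius)" }
}

extension Circle {
    // New, non-overriding members are fine in an extension.
    func area() -> Double { Double.pi * radius * radius }
}

let shape: Shape = Circle()
print(shape.describe())
```

So even with the "storage first, everything else in extensions" style, the primary declaration of a subclass ends up holding every override as well.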

It would be great if we could all write code populated entirely with pure value types and small, self-contained type definitions, but that's not always feasible. If you're implementing a UIViewController subclass, for example, you have no hope of understanding all the states your type can be in, because UIViewController (at least as of iOS 10) had over 150 stored properties, in addition to whatever your own subclass was going to add.

So yeah, I'd love to be able to incrementally improve this code by moving things into extensions. Right now that's not possible, but I'd love it if it were. I agree that it's often more useful for classes than for structs, but I don't think it would make sense to restrict it only to classes. This feature would be welcome to me. It would be even more welcome if it were possible to override a superclass function from within an extension, so long as it was also in the same file as the type declaration, but I'm happy to accept incremental progress.

that said, i wouldn’t exactly hold up UIViewController as an example of something to emulate

I agree, but a lot of Swift users have a lot of existing UIViewController subclasses. Should we ignore that large base of existing users in the hope of attracting others who won't use such messy subclasses?

I'm pretty sympathetic to concerns around keeping code well-organized, since poorly organized code is more difficult to understand. I also think that this is contextual, and can go both ways.

For example, not all functionality in a type is necessarily "core" to its definition. A desirable protocol conformance can include required members that are more incidental, and specifically related to the protocol rather than a core functionality of the type. Being required to define these properties in the main type definition, rather than in the related extension, can make the code less well-organized.

As an example that's a bit less contrived, take this pattern for applying a piece of state to an object:

protocol SetState {
  associatedtype State
  func update(to state: State)
}

As an optimization to avoid redundant work, we may wish to avoid calling this method repeatedly with the same value. We could provide this behavior by default, as long as we had state to track this in:

// Class-constrained so that the non-mutating method in the extension below
// is allowed to assign to `previouslyAppliedState`
protocol SetState: AnyObject {
  associatedtype State: Equatable
  func update(to state: State)

  var previouslyAppliedState: State? { get set }
}

extension SetState {
  // Updates the object to the given state, if it isn't already in that state
  func updateIfNecessary(to state: State) {
    if previouslyAppliedState != state {
      previouslyAppliedState = state
      update(to: state)
    }
  }
}

This previouslyAppliedState stored property requirement is mostly an incidental implementation detail of the protocol, and not something that I would consider "core" to the functionality of any type that implements the protocol. In fact, as a consumer of the type, I don't even really need to know or care that this stored property exists.

In this case it seems much better to define the stored property in the same extension as the other members that implement the conformance:

class MyComponent {
  // ... "core" stored properties

  // ... main methods
}

extension MyComponent: SetState {
  // Proposed syntax: a stored property declared in an extension
  var previouslyAppliedState: ComponentState?

  func update(to state: ComponentState) {
    // ...
  }
}

rather than define it in the main body of the type with all of the other stored properties:

class MyComponent {
  // ... "core" stored properties

  // Implements `SetState` requirement
  var previouslyAppliedState: ComponentState?

  // ... main methods
}

extension MyComponent: SetState {
  func update(to state: ComponentState) {
    // ...
  }
}
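
For anyone who wants to try the pattern, here is a self-contained sketch of the second arrangement, which is the one that compiles today. `ComponentState`'s `title` field and the `updateCount` counter are made up for the demonstration, and the protocol is class-constrained (an assumption on my part) so that the default method can write to the property.

```swift
protocol SetState: AnyObject {
    associatedtype State: Equatable
    func update(to state: State)
    var previouslyAppliedState: State? { get set }
}

extension SetState {
    // Updates the object only if it isn't already in the given state.
    func updateIfNecessary(to state: State) {
        if previouslyAppliedState != state {
            previouslyAppliedState = state
            update(to: state)
        }
    }
}

// Hypothetical state type for the demonstration.
struct ComponentState: Equatable {
    var title: String
}

final class MyComponent {
    var updateCount = 0  // hypothetical: counts how often real work happens

    // Implements the `SetState` requirement; today it must live here.
    var previouslyAppliedState: ComponentState?
}

extension MyComponent: SetState {
    func update(to state: ComponentState) {
        updateCount += 1
    }
}

let component = MyComponent()
component.updateIfNecessary(to: ComponentState(title: "A"))
component.updateIfNecessary(to: ComponentState(title: "A"))  // skipped
component.updateIfNecessary(to: ComponentState(title: "B"))
print(component.updateCount)  // 2
```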

Even if those extensions were in different files... interesting angle :thinking:. Of course, we don't always use IDEs (sometimes we're reviewing pull requests with a web-based tool, or even in a terminal, etc.).

Quite right. Another example would be var properties with didSet. To keep those cleaner you can split them up, but that's an additional complexity to worry about:

struct Foo {
    var property: Int {
        didSet { propertyChanged() }
    }
    // ...
}

extension Foo {
    func propertyChanged() {
        // ... many lines of code here ...
    }
}

I wonder if there’s a middle ground where we make it clear that the “what is this thing” properties must continue to live in the main definition while “auxiliary,” non-essential pieces of state can live in same-file extensions… we could, for example, keep it so that synthesis of Equatable et al only considers stored properties in the main type definition. Of course, that would mean that in the following example:

struct S {
  var x: Int
}
extension S: Equatable {
  var y: Int
}

Only x would be used for Equatable synthesis, which seems not great.
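
Since stored properties in extensions are not valid today, the surprising outcome can only be emulated. In this sketch `y` lives in the main body and a hand-written `==` stands in for what "main definition only" synthesis would presumably produce:

```swift
struct S {
    var x: Int
    // Under the pitch this would be declared in the extension below;
    // today it has to live here.
    var y: Int
}

extension S: Equatable {
    // Hand-written stand-in for synthesis that only considers the
    // "main definition" property x.
    static func == (lhs: S, rhs: S) -> Bool { lhs.x == rhs.x }
}

let equalDespiteY = S(x: 1, y: 1) == S(x: 1, y: 2)
print(equalDespiteY)  // true
```

Two values that differ in `y` compare equal, which is the "not great" result described above.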

Agreed that this seems a bit too surprising, particularly in the example you shared. I think we would want to avoid having different "classifications" of stored properties. One problem with that is that moving code between the main definition and extension would continue compiling but silently change the behavior of the code.

I would agree that we definitely don’t want that.

If it is considered wise to allow stored properties to be broken up into extensions, then one would expect them all to be considered for Equatable; if it is not wise, then I would expect the whole proposal to be rejected.

I don’t think the answer to the complexity of having stored properties in extensions is to add further complexity by making them a “second tier” of stored properties with different behavior in certain circumstances.

Perhaps a genuinely useful distinction here is whether or not the property needs to appear in the struct's memberwise initializer. I would expect "core" properties of a type to appear in the memberwise initializer, but other incidental properties wouldn't necessarily need to be included. This is the case for both of the non-contrived examples I shared above, and may actually be a pretty natural solution for the tension here.

If we only permitted this for default-initialized properties, such that they weren't included in the synthesized memberwise initializer, would that also help prevent "misuse" of this feature?
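
For context, in current Swift a `var` with a default value still shows up in the synthesized memberwise initializer, as a parameter with a default argument, so leaving such properties out would be a new rule rather than existing behavior. A small sketch with a made-up `Probe` type:

```swift
struct Probe {
    var name: String
    var visitCount = 0  // hypothetical "incidental" property with a default
}

// Both calls use the synthesized memberwise initializer.
let viaDefault = Probe(name: "Mars")
let viaExplicit = Probe(name: "Mars", visitCount: 3)

print(viaDefault.visitCount, viaExplicit.visitCount)  // 0 3
```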

Another question: is the issue this pitch identifies with the existing solution to the conformance problem:

struct Planet {
  let name: String
  let atmosphere: Atmosphere?
  // Implements AstronomicalObject:
  var mass: Double
  let parent: AstronomicalObject?
}

extension Planet: AstronomicalObject {
  func terraform() { /* add an atmosphere */ }
}

that mass and parent must appear in the main Planet definition, or that they don't appear in the extension that declares the AstronomicalObject conformance? IOW, would another resolution be to allow the following?

struct Planet {
  let name: String
  let atmosphere: Atmosphere?
  // Implements AstronomicalObject:
  var mass: Double
  let parent: AstronomicalObject?
}

extension Planet: AstronomicalObject {
  // No-op redeclarations, perhaps must be witnesses for `AstronomicalObject` requirements
  // or referenced by other members in the extension?
  var mass: Double
  let parent: AstronomicalObject?
  func terraform() { /* add an atmosphere */ }
}

I really think same-file extensions already got way more special-case support from the language than they deserve, but there is an option that would actually simplify the rules:
Allow stored properties to be declared in extensions — without restrictions.

There's a price to pay for that flexibility (at least when the type comes from another module), but I don't think that penalty is big enough to completely exclude this alternative from the discussion.