Mutability for Foundation types in Swift

Charles_Srstka · April 23, 2016, 3:48am

That’s interesting; I hadn’t known that. What causes that? My understanding had always been that the NS and CF objects, being toll-free bridged to each other, shared the same default implementations, with the only difference being that the NS versions involved the overhead from objc_msgSend() as well as, in many cases, an autorelease.

Charles

···

On Apr 22, 2016, at 5:05 PM, Greg Parker <gparker@apple.com> wrote:

On Apr 22, 2016, at 2:36 PM, Charles Srstka via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

One comment:

"In the most common case where a developer does not provide a custom reference type, then the backing store is our existing NSData and NSMutableData implementations. This consolidates logic into one place and provides cheap bridging in many cases (see Bridging for more information).”

Would it not be more efficient to bridge to the C-based CFData and CFMutableData implementations instead, to avoid the object overhead?

Not necessarily. Foundation often has less overhead than CF nowadays.

Tony_Parker · April 25, 2016, 4:20pm

Hi Brent,

Thanks for your feedback! You’ve got some great questions below, I’ll try to answer them as best I can.

We took this feedback seriously, and I would like to share with you the start of an important journey for some of the most commonly used APIs on all of our platforms: adopting value semantics for key Foundation types.

This proposal is way cool, and I think you've selected a great starting set of APIs.

Of the other candidate APIs you mentioned, I'm definitely looking forward to AttributedString and some parts of the NSURL Loading system (primarily the requests and responses; the connections and sessions should probably be object types). Predicate, OrderedSet, CountedSet, and possibly Number and Value make sense as well.

However, I think that Locale, Progress, Operation, Calendar, and Port are poor candidates for becoming value types, because they represent specific resources which may be partially outside of your thread's/process's control; that is, they're rather like NS/UIView, in that they represent a specific, identifiable "thing" which cannot be copied without losing some aspect of its meaning.

Agreed, that is why they are at the bottom in the “not value types” section. =)

(I also see little value in an Error type; honestly, in the long run I'd like to see Swift-facing Foundation grow towards pretending that NSError doesn't exist and only ErrorProtocol does.)

ErrorProtocol is a kind-of-NSError today, as the compiler magically generates the two NSError primitives for you (code and domain), but does not expose this anywhere to make it possible to do the correct thing w.r.t. localized error messages.

Anyway this is a complicated topic, which is why I deferred it to another proposal.

The following code is a more natural match for the way Swift developers would expect this to work:

  var myDate = Date()
  myDate.addTimeInterval(60) // OK

  let myOtherDate = Date()
  myOtherDate.addTimeInterval(60) // Error, as expected

The semantic is definitely more what Swift users expect, but the name may not be. As far as I can tell, this method should be called `add`, with `TimeInterval` elided under the "omit needless words" rule of Swift API translation. (Or just call it `+=`, of course…)

The actual API uses +=, this is just an example to prove a point.
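For illustration, a minimal sketch of the same point using the operator spelling. It assumes the `Date` struct as it ships in Foundation for Swift 3 and later; the variable names are only illustrative.

```swift
import Foundation

// A mutable binding allows in-place mutation of the value.
var myDate = Date(timeIntervalSinceReferenceDate: 0)
let original = myDate   // an independent copy, not a shared reference
myDate += 60            // advances myDate by 60 seconds

// The copy is unaffected, and Date values compare with ordinary operators.
print(original < myDate)                    // true
print(myDate.timeIntervalSince(original))   // 60.0

// A `let` binding rejects mutation at compile time:
// let fixed = Date()
// fixed += 60   // does not compile: `fixed` is a constant
```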

URL NSURL
URLComponents NSURLComponents

Have you considered unifying these? NSURL has no mutable counterpart, but NSURLComponents sort of informally plays that role. As it is, it seems like URL would not really be able to support mutation very well—the operations would all be nonmutating under the hood.

The pattern here is that URLComponents properties are basically the arguments to a factory that creates URL. They aren’t really the same thing as a URL itself. Also, like DateComponents or PersonNameComponents, you may want to represent a “partial URL” as some kind of intermediate object, in which case a URLComponents would be the right type because a URL should be valid.
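A small sketch of that factory relationship, assuming the `URLComponents` and `URL` structs as they exist in Foundation today; the host and query values are made up for the example.

```swift
import Foundation

// URLComponents can hold a partial URL while it is being assembled.
var components = URLComponents()
components.scheme = "https"
components.host = "example.com"   // hypothetical host
components.path = "/proposals"
components.queryItems = [URLQueryItem(name: "id", value: "0069")]

// The `url` property is optional: it only yields a URL when the parts form a valid one.
if let url = components.url {
    print(url.absoluteString)   // https://example.com/proposals?id=0069
}
```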

Using Swift structures for our smallest types can be as effective as using tagged pointers in Objective-C.

Have you looked at the performance and size of your value type designs when they're made Optional? In the general case, Optional works by adding a tag byte after the wrapped instance, but for reference types it uses the 0x0 address instead. Thus, depending on how they're designed and where Swift is clever enough to find memory savings, these value types might end up taking more memory than the corresponding reference types when made Optional, particularly if the reference types have tagged pointer representations.

That’s true, but not all reference types wind up as tagged pointers either. NSString, for example, does it based on the content and length of the C string.

I’m ok with the tradeoff here for the few types we have in this category (primarily Date).
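The size difference Brent describes can be checked with `MemoryLayout`. This is only a sketch: `Timestamp` and `TimestampObject` are hypothetical stand-ins (a struct wrapping a single `Double`, and a plain class), not the real Foundation types.

```swift
// Hypothetical stand-in for a small value type: a struct wrapping one Double.
struct Timestamp {
    var secondsSinceReference: Double
}

// Hypothetical stand-in for a reference type.
final class TimestampObject {
    var secondsSinceReference: Double = 0
}

// A Double has no spare bit patterns, so Optional needs an extra tag byte.
print(MemoryLayout<Timestamp>.size, MemoryLayout<Timestamp?>.size)

// A class reference can use the null address for `nil`, so Optional is free.
print(MemoryLayout<TimestampObject>.size, MemoryLayout<TimestampObject?>.size)
```

Note that `stride`, which governs spacing in arrays, rounds the optional struct up further than `size` suggests.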

if !isUniquelyReferencedNonObjC(&_box) {

Something I have often wondered about: Why doesn't `isUniquelyReferenced(_:)` use `-retainCount` on Objective-C objects? Alternatively, why not use `-retainCount` on fields in your value types when you're trying to implement COW behavior? It seems like that would allow you to extend the copy-on-write mechanism to Objective-C objects. I know that `-retainCount` is usually not very useful, but surely this copy-on-write situation, where you are using it in an advisory fashion and an overestimated retain count will simply cause you to unnecessarily lose some efficiency, is the exception to the rule?

There are likely good reasons for this decision—they simply aren't obvious, and I'd like to understand them.

--
Brent Royal-Gordon
Architechies

There are a few reasons for this, but the one that sticks out most to me is that in Objective-C, retain, release, and retainCount don’t include weak references. If you take a look at the internals for the Swift retain count function, you’ll see that there are two: owned and unowned.
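For readers unfamiliar with the pattern under discussion, here is a minimal copy-on-write sketch. `Box` and `Buffer` are hypothetical names, and the check uses `isKnownUniquelyReferenced(_:)`, which is the spelling in current Swift for what this thread calls `isUniquelyReferencedNonObjC`.

```swift
// Reference-typed backing store (hypothetical name).
final class Box {
    var bytes: [UInt8]
    init(_ bytes: [UInt8]) { self.bytes = bytes }
}

// Value type with copy-on-write storage (hypothetical name).
struct Buffer {
    private var _box: Box
    init(_ bytes: [UInt8]) { _box = Box(bytes) }

    var bytes: [UInt8] { _box.bytes }

    mutating func append(_ byte: UInt8) {
        // Copy the backing store only if another value shares it.
        if !isKnownUniquelyReferenced(&_box) {
            _box = Box(_box.bytes)
        }
        _box.bytes.append(byte)
    }
}

var a = Buffer([1, 2])
let b = a     // shares the Box until one side mutates
a.append(3)   // the Box is shared, so this copies first
print(a.bytes, b.bytes)   // [1, 2, 3] [1, 2]
```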

- Tony

···

On Apr 24, 2016, at 3:44 AM, Brent Royal-Gordon <brent@architechies.com> wrote:

Tony_Parker · April 25, 2016, 5:55pm

Hi Ben,

<https://github.com/apple/swift-evolution/blob/master/proposals/0069-swift-mutability-for-foundation.md>

The proposal looks great.

## Introduction

Broken link:

-- <https://developer.apple.com/videos/play/wwdc2015-414/>
++ <https://developer.apple.com/videos/play/wwdc2015/414/>

Thanks; I’ll fix that.

### New Value Types

CharacterSet:

* Rename to UnicodeScalarSet?

We made a decision to leave the names of the types the same between Swift and Foundation. It’s a tradeoff for sure, but it seems better than the alternatives. If we changed the names, the biggest challenges would be keeping documentation consistent and preserving a common understanding of each type’s purpose.

* Update APIs to follow SE-0059 (SetAlgebra) proposal?

We’ll make sure they match in the implementation.
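As a rough illustration of what set-algebra operations on the value type look like, here is a sketch that assumes `CharacterSet` as it exists in Foundation today; the hex-digit example is made up for the purpose.

```swift
import Foundation

// Start from a predefined set and mutate a local copy in place.
var allowed = CharacterSet.decimalDigits
allowed.insert(charactersIn: "abcdef")

// Non-mutating set algebra returns a new value.
let hexDigits = allowed.union(CharacterSet(charactersIn: "ABCDEF"))

print(hexDigits.contains("f"), hexDigits.contains("G"))   // true false
print(allowed.isSubset(of: hexDigits))                     // true
```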

* Add `enumerateRanges` method, similar to NSIndexSet?

Not a bad idea. We’ll consider new API separately from the main thrust of the proposal (its transition to a value type in the first place).

Thanks!
- Tony

···

On Apr 24, 2016, at 3:51 PM, Ben Rimmington via swift-evolution <swift-evolution@swift.org> wrote:

-- Ben

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Douglas_Gregor · April 26, 2016, 4:44am

There aren’t enough +1s in the world for this, I fully endorse your proposal and would like to subscribe to your newsletter ;)

Do you envision the apinotes will be the vehicle for performing the bridging since ObjectiveCBridgeable was deferred? I actually haven’t checked if that was merged but left as a private protocol or if it still only works in collections.

_ObjectiveCBridgeable is still there, and despite the underscore and the fact that it doesn’t match the interface in the deferred proposal, it’s essentially fully implemented. The Clang side (swift_bridge attribute) is in swift-clang, and there is API notes support for adding it without modifying headers.

- Doug

···

On Apr 25, 2016, at 8:39 PM, Russ Bishop via swift-evolution <swift-evolution@swift.org> wrote:

Russ

On Apr 22, 2016, at 10:18 AM, Tony Parker via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Dear swift-evolution denizens,

As you know from our announcement of Swift Open Source and our work on naming guidelines, one of our goals for Swift 3 is to “drop NS” for Foundation. We want to make the cross-platform Foundation API that is available as part of swift-corelibs feel like it is not tied to just Darwin targets. We also want to reinforce the idea that new Foundation API must fit in with the language, standard library, and the rapidly evolving design patterns we see in the community.

You challenged us on one part of this plan: some Foundation API just doesn’t “feel Swifty”, and a large part of the reason why is that it often does not have the same value type behavior as other Swift types. We took this feedback seriously, and I would like to share with you the start of an important journey for some of the most commonly used APIs on all of our platforms: adopting value semantics for key Foundation types.

We have been working on this for some time now, and the set of diffs that result from this change is large. At this point, I am going to focus effort on an overview of the high level goals and not the individual API of each new type. In order to focus on delivering something up to our quality standards, we are intentionally leaving some class types as-is until a future proposal. If you don’t see your favorite class on the list — don’t despair. We are going to iterate on this over time. I see this as the start of the process.

One process note: we are still trying to figure out the best way to integrate changes to API that ship as part of the operating system (which includes Foundation) into the swift-evolution review process. Swift-evolution is normally focused on changes to functionality in the compiler or standard library. In general, I don’t expect all new Foundation API introduced in the Darwin/Objective-C framework to go through the open source process. However, as we’ve brought up this topic here before, I felt it was important to bring this particular change to the swift-evolution list.

As always I welcome your feedback.

https://github.com/apple/swift-evolution/blob/master/proposals/0069-swift-mutability-for-foundation.md

Thanks,
- Tony


Tony_Parker · April 22, 2016, 9:15pm

Hi Riley,

Very happy to see this proposal; felt strange that for a language so focused on value-types an entire framework open sourced with the language was composed entirely of reference-types (albeit for obvious reasons). So +1 for that.

One particular section that caught my interest was this:

The most obvious drawback to using a struct is that the type can no longer be subclassed. At first glance, this would seem to prevent the customization of behavior of these types. However, by publicizing the reference type and providing a mechanism to wrap it (mySubclassInstance as ValueType), we enable subclasses to provide customized behavior.

I'm incredibly biased, but I recently proposed and submitted a pull request that would introduce "factory initializers" to the language (https://github.com/apple/swift-evolution/pull/247). The full proposal has more info, but essentially factory initializers would allow for directly returning initialized types from designated factory initializers, similar to how initializers are implemented in Objective-C.

Anyway, I feel the Factory Initializer proposal would work very well with this Foundation proposal. While I believe the current suggestion of casting the reference type as the value type works well, I don't believe it is necessarily the client of the API's job to use it; I believe it would make far more sense for there to be an extension adding additional factory initializers to the class, which would determine the underlying reference type to use based on the input parameters.

For example, here is the example of using a custom subclass for the Data type mentioned in this Foundation proposal:

/// Create a Data with a custom backing reference type.
class MyData : NSData { }
let dataReference = MyData()
let dataValue = dataReference as Data // dataValue copies dataReference

I personally would rather see something akin to this:

public extension Data {
  factory init(inputData: ...)
  {
    if ... {
      // Return subclass best suited for storing this particular input data
      return MyData(inputData) as Data
    }
    else {
      let data = NSData()

      /* OMITTED: add hypothetical inputData to NSData depending on what it is */

      return data
    }
  }
}

This means the client of the API never has to worry about which subclass is best suited for them; everything would "just work". This also better mimics the existing class cluster pattern in Foundation, which might help with this transition should my proposal be accepted.
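Since `factory init` is proposed syntax rather than something Swift accepts, a static method can approximate the idea in the meantime. The sketch below is only an assumption about how that might look: the `Payload` enum and the `make(from:)` name are hypothetical, and it chooses a construction strategy rather than a backing subclass.

```swift
import Foundation

// Hypothetical input type, introduced only for this example.
enum Payload {
    case text(String)
    case bytes([UInt8])
}

extension Data {
    // Hypothetical factory: picks a construction strategy based on the input.
    static func make(from payload: Payload) -> Data {
        switch payload {
        case .text(let string):
            return Data(string.utf8)
        case .bytes(let bytes):
            return Data(bytes)
        }
    }
}

let greeting = Data.make(from: .text("hi"))
let raw = Data.make(from: .bytes([0x00, 0xFF]))
print(greeting.count, raw.count)   // 2 2
```

The caller never names a concrete storage type, which is the property the class cluster pattern provides.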

Regardless though, very happy to see this being pushed forward. Just thought I'd suggest ways to make this proposal (hopefully) easier to both implement and use :)

Thanks for your feedback.

For what it’s worth, I’m fully in support of your factory type proposal as well. I think we need it in order to finish a complete implementation of swift-corelibs-foundation, at the very least.

We can certainly extend these types to include use of the factory types once we get them into the language.

- Tony

···

On Apr 22, 2016, at 1:34 PM, Riley Testut <rileytestut@gmail.com> wrote:

On Apr 22, 2016, at 12:52 PM, Tony Parker via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Hi David,

On Apr 22, 2016, at 12:13 PM, David Waite <david@alkaline-solutions.com <mailto:david@alkaline-solutions.com>> wrote:

Amazing, I am really looking forward to this feature!

Comments:

- For Locale and Calendar, one possible Swift layout would be to synthesize a protocol and to use that to represent bridged API. You could then bridge inbound to either the immutable value type or the dynamic class-based type. On the Swift side, these are constructed as two distinct types.

That’s an interesting approach, I’ll consider that for these.

- For any of these types, are there improvements (similar to String) which would be worth making before exposing ’the’ Swift type and API? The ones I’m specifically worried about are Date and URL, since I’ve seen so many standard language time and networking API show their age over time.

-DW

We’re absolutely going to be making Swift-specific improvements to many of these types. I think the resulting API is better in many ways. For example, on URL the main improvement is that the resource values dictionary is now a struct type with a lot of strongly-typed properties. It’s still got a lot of optionals because of the way that the underlying fetch works, but it’s better. Date gains mutating methods along with support for operators like +=, <, and >.
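A brief sketch of the strongly-typed resource values, assuming the `URL.resourceValues(forKeys:)` API and `URLResourceValues` struct as they exist in Foundation today. The temporary directory is used only because it is a path that should exist.

```swift
import Foundation

let directory = URL(fileURLWithPath: NSTemporaryDirectory())

do {
    // Request only the keys of interest; the result is a struct, not a dictionary.
    let values = try directory.resourceValues(forKeys: [.isDirectoryKey, .nameKey])

    // Each property is optional because a value may not have been fetched or may be unavailable.
    if let isDirectory = values.isDirectory {
        print("isDirectory:", isDirectory)
    }
    if let name = values.name {
        print("name:", name)
    }
} catch {
    print("could not read resource values:", error)
}
```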

One of the guiding principles of our effort was evolution over revolution. Foundation is obviously used in tons and tons of API. We want to maintain conceptual compatibility with the entire OS X / iOS / watchOS / tvOS SDK when it is imported into Swift. Hopefully this also means that converting from reference to value types in your own uses of these APIs does not require a complete rethink of how you use them, but still provides the benefits outlined in the proposal. We’ll continue to iterate and improve over time.

Thanks,

- Tony

David_Smith · April 23, 2016, 4:20am

One comment:

"In the most common case where a developer does not provide a custom reference type, then the backing store is our existing NSData and NSMutableData implementations. This consolidates logic into one place and provides cheap bridging in many cases (see Bridging for more information).”

Would it not be more efficient to bridge to the C-based CFData and CFMutableData implementations instead, to avoid the object overhead?

Not necessarily. Foundation often has less overhead than CF nowadays.

That’s interesting; I hadn’t known that. What causes that? My understanding had always been that the NS and CF objects, being toll-free bridged to each other, shared the same default implementations, with the only difference being that the NS versions involved the overhead from objc_msgSend() as well as, in many cases, an autorelease.

Charles

There's a wide variety of bridging techniques in use, but in NSData's case the implementations are separate (and there are 5 implementations for NSData, 1 for CFData). CFData also has to pay the cost for detecting whether its argument is a bridged NSData, which is ironically about as expensive as a message send.

David

···

On Apr 22, 2016, at 8:48 PM, Charles Srstka via swift-evolution <swift-evolution@swift.org> wrote:

On Apr 22, 2016, at 5:05 PM, Greg Parker <gparker@apple.com> wrote:
On Apr 22, 2016, at 2:36 PM, Charles Srstka via swift-evolution <swift-evolution@swift.org> wrote:

Douglas_Gregor · April 25, 2016, 5:32pm

Right. I went down this rabbit hole a few weeks ago, trying to determine if we could make an “isUniquelyReferenced” that works for Objective-C-defined classes. One obvious issue is that you can’t always trust retainCount to do the right thing for an arbitrary Objective-C class, because it may have been overridden. We can probably say “don’t do that” and get away with it, but that brings us to Tony’s point: Objective-C weak references are stored in a separate side table, so we can’t atomically determine whether an Objective-C class is uniquely referenced. On platforms that have a non-pointer “isa” we could make this work through the inline reference count (which requires changes to the Objective-C runtime and therefore wouldn’t support backward deployment), but that still doesn’t give us “isUniquelyReferenced” for Objective-C classes everywhere.

Interestingly, while Swift’s object layout gives us the ability to consider the weak reference count, the Swift runtime currently does not do so. IIRC, part of our reasoning was that isUniquelyReferencedNonObjC is generally there to implement copy-on-write, where weak references aren’t actually interesting. However, we weren’t comfortable enough in that logic to commit to excluding weak references from isUniquelyReferencedNonObjC “forever”, and certainly aren’t comfortable enough in that reasoning to enshrine “excluding weak references” as part of the semantics of isUniquelyReferenced for Objective-C classes.

- Doug

John_McCall · April 25, 2016, 8:58pm

It's arguably not necessary to do this check atomically with respect to concurrent attempts to retain/release:
  - If somebody is racing to increase the reference count to 2, they must be copying this variable.
  - If somebody is racing to increase the reference count to > 2, our reference is non-unique either way.
  - If somebody is racing to decrease the reference count to 1, then we either do an over-safe copy or we get lucky avoiding it.
  - If somebody is racing to decrease the reference count to 0, they must be mutating this variable.
We only do this uniqueness check when mutating the variable, so a concurrent attempt to copy or mutate it is an illegal read/write or write/write race.

Weak references are potentially a different story, both semantically and in terms of concurrency. There could be an outstanding weak reference that somebody could be concurrently loading (and hence retaining), breaking the assumption in the first bullet above. The only way to fix that is to treat weak references as making strong references non-unique, and AFAIK we have no way of detecting the non-existence of weak references on existing operating systems. But it's not clear to me that we need to give this much weight to weak references; after all, it's equally possible to form unsafe references to existing objects, which is completely undetectable, and we've traditionally described that as just an unsafe weak reference.

You have to come up with fairly contrived use cases for weak references to see problems with just ignoring them, like a memoization table that maintains both strong and weak references but drops the strong references when there's memory pressure. In this case, our value type might end up with the last strong reference, and changing the object in-place would corrupt the cached value. But this would be a really odd way to implement a memoization table, basically just using weak references for their own sake.
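The contrived scenario can be sketched as follows. This is an illustration under assumptions, not a description of any real cache: `Box` and `Numbers` are hypothetical names, the `cached` variable plays the role of the memoization table's weak reference, and the check uses `isKnownUniquelyReferenced(_:)`, the current spelling of the function discussed in this thread.

```swift
// Reference-typed backing store (hypothetical name).
final class Box {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

// Copy-on-write value type that exposes its box for the demonstration.
struct Numbers {
    private(set) var box: Box
    init(_ values: [Int]) { box = Box(values) }

    mutating func append(_ value: Int) {
        // Only strong references are considered here.
        if !isKnownUniquelyReferenced(&box) {
            box = Box(box.values)
        }
        box.values.append(value)
    }
}

var numbers = Numbers([1, 2, 3])
weak var cached = numbers.box   // stands in for the table's weak reference

numbers.append(4)
print(numbers.box.values)
print(cached?.values ?? [])
```

If the uniqueness check ignores the weak reference, the append happens in place and the second line shows the mutated contents, which is the "corrupted cached value" described above.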

John.
