Pitch: Much better-performing, Swift-native Progress replacement


(Charles Srstka) #1

I’ve been working on a replacement for Progress (NSProgress) for use in my own code for a while now. It hasn’t been extensively battle-tested yet, but for the time being it seems to work. The full code for it is at https://github.com/CharlesJS/CSProgress (BSD licensed, with the exception that Apple/the Swift team is granted permission to treat it as public-domain code if they so choose).

A rather detailed description of this class and its motivations are in the read-me for the linked repository, and copying the whole thing would probably run afoul of the mailing list’s size limit. However, these are the highlights:

- Much better performance. In my testing, updating the completedUnitProperty, with something observing it, takes 1.04 seconds on my 2013 Retinabook, compared with NSProgress’s 52.91 seconds (58.53 seconds if an autorelease pool is used). This frees back-end code from the responsibility of being adulterated with what is effectively UI code in order to determine when the progress object should be updated.

- Much better RAM usage. NSProgress generates a large number of autoreleased objects, which can cause considerable memory bloat if they are not wrapped in an autorelease pool, which further hinders performance. By using Swift-native types whenever available, this problem is avoided.

- Better thread safety, by using atomic operations to update the completedUnitCount property. NSProgress requires taking the lock twice, once to read the old value, and once to write the new value. CSProgress reduces this by allowing the delta to be sent, meanwhile solving the race condition that can occur if something else updates the property in between the two operations.

- Includes a “granularity” property, customizable by the user, which determines how often fractionCompleted notifications will be sent. The default is 0.01, which means that fractionCompleted notifications will be sent whenever fractionCompleted becomes at least 0.01 greater than its value at the time the last notification was sent. For a steady increase from 0 to totalUnitCount, the notification will be sent 100 times. This prevents unnecessary UI updates when the progress is incremented by a negligible amount.

- Much better and “Swiftier” interface:
  - All observations are done via closures, rather than KVO.
  
  - All observations are sent via an OperationQueue which is specifiable by the user, and which defaults to the main queue. This means that
    UI elements can be directly set from within the notification closures.

  - Unlike the closure-based APIs on NSProgress such as cancellationHandler, CSProgress allows multiple closures to be specified for each observation type, in case a progress object has more than one client that wants to observe it.

  - Uses generics for all functions taking unit counts, so that any integer can be passed to them instead of having to cast everything to Int64.

  - Adds a wrapper struct encapsulating both a parent progress and a pending unit count, so that explicit tree composition can be used without
          requiring functions to know the pending unit counts of their parents, thus preserving the loose coupling inherent to the old-fashioned implicit style.

- Bridging to the standard Objective-C Progress/NSProgress. This code was added the most recently, and is the most hackish by far, so it will need a lot of testing, but for the time being it seems to pass my unit tests. The bridging is designed to allow CSProgress to insert itself into any part of an NSProgress tree, being able to handle NSProgress objects as parents or as children. Updates to parent NSProgresses are also done on a user-specified OperationQueue, which defaults to the main queue. This means that if you prefer NSProgress’s KVO-based API, you can actually bind your UI elements directly to the NSProgress, and since all updates to the NSProgress will happen on the main queue, you won’t need any special property observers to forward notifications to the main thread.

I am wondering what the community thinks of this. I’ve tried to keep the interface as close to Progress/NSProgress’s as possible, to allow this to be a drop-in replacement. There are a few features of NSProgress that are still unimplemented; however, I could add them if the response is positive. With the interface mirroring Progress’s, this means that, with some tweaking and extensive battle testing, CSProgress could potentially serve as a replacement for Progress, with the original Progress being renamed back to NSProgress. The result would break binary compatibility, but could be source-compatible, and I think it has the potential to improve the experience on Swift.

What do you think?

Charles


(Charles Srstka) #2

This should have said “updating the completedUnitProperty one million times, with something observing it.” I eagerly await the implementation of our new forum, hopefully with an edit feature, with bated breath.

Charles

···

On Feb 17, 2017, at 1:42 PM, Charles Srstka via swift-evolution <swift-evolution@swift.org> wrote:

- Much better performance. In my testing, updating the completedUnitProperty, with something observing it, takes 1.04 seconds on my 2013 Retinabook, compared with NSProgress’s 52.91 seconds (58.53 seconds if an autorelease pool is used). This frees back-end code from the responsibility of being adulterated with what is effectively UI code in order to determine when the progress object should be updated.


(Rod Brown) #3

I applaud the idea. I too find the (NS)Progress API to be very low quality. It seems a vestige of an earlier time when Cocoa was young and APIs that seem like they should be simple, just... aren't. I would love to see a much better API developed.

I'm curious how this idea of developing something from the ground up works with Apple's preferred idea of using Swift to bring Foundation forward.

-Rod

···

On 18 Feb 2017, at 6:48 am, Charles Srstka via swift-evolution <swift-evolution@swift.org> wrote:

On Feb 17, 2017, at 1:42 PM, Charles Srstka via swift-evolution <swift-evolution@swift.org> wrote:

- Much better performance. In my testing, updating the completedUnitProperty, with something observing it, takes 1.04 seconds on my 2013 Retinabook, compared with NSProgress’s 52.91 seconds (58.53 seconds if an autorelease pool is used). This frees back-end code from the responsibility of being adulterated with what is effectively UI code in order to determine when the progress object should be updated.

This should have said “updating the completedUnitProperty one million times, with something observing it.” I eagerly await the implementation of our new forum, hopefully with an edit feature, with bated breath.

Charles

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Charles Srstka) #4

There are several ground-up replacements in the Swift overlay already, if you look at the various value types that have been introduced to replace various NS classes (granted, some of them call through to the underlying Foundation API for their implementation, but others don’t). As long as we eventually end up supporting NSProgress’s interface, something like this should be possible to drop in as long as we do it before the ABI stability lockdown.

In any event, I BSD-licensed the code, so it’s free to use whether or not Apple ends up using it, so feel free to give it a try (for performance testing, be sure to compile in Release mode, as it does much better with the optimizations on). If you find any bugs or problems, please let me know.

Thanks,
Charles

···

On Feb 21, 2017, at 5:24 AM, Rod Brown <rodney.brown6@icloud.com> wrote:

I applaud the idea. I too find the (NS)Progress API to be very low quality. It seems a vestige of an earlier time when Cocoa was young and APIs that seem like they should be simple, just... aren't. I would love to see a much better API developed.

I'm curious how this idea of developing something from the ground up works with Apple's preferred idea of using Swift to bring Foundation forward.


(Rod Brown) #5

My concern regarding a new class in the overlay is interoperability.

With a lot of things in the Swift Overlay, identity isn't relevant. For example, we turn a lot of Objective-C classes into structs because their identity is not relevant, and they should be copied anyway, so it makes sense to let the Swift world be different.

Progress however has innate identity for tracking purposes. Separate Progresses for Swift and Obj-C causes problems. If I want to pass my progress to an Obj-C API, I can't hand it my Swift progress, I must hand it an Obj-C one.

I can't foresee how a new Progress will integrate cleanly with existing NSProgress API. Is it worthwhile creating a separate class that is incompatible with Obj-C Foundation?

···

On 22 Feb 2017, at 1:40 am, Charles Srstka <cocoadev@charlessoft.com> wrote:
On Feb 21, 2017, at 5:24 AM, Rod Brown <rodney.brown6@icloud.com> wrote:

I applaud the idea. I too find the (NS)Progress API to be very low quality. It seems a vestige of an earlier time when Cocoa was young and APIs that seem like they should be simple, just... aren't. I would love to see a much better API developed.

I'm curious how this idea of developing something from the ground up works with Apple's preferred idea of using Swift to bring Foundation forward.

There are several ground-up replacements in the Swift overlay already, if you look at the various value types that have been introduced to replace various NS classes (granted, some of them call through to the underlying Foundation API for their implementation, but others don’t). As long as we eventually end up supporting NSProgress’s interface, something like this should be possible to drop in as long as we do it before the ABI stability lockdown.

In any event, I BSD-licensed the code, so it’s free to use whether or not Apple ends up using it, so feel free to give it a try (for performance testing, be sure to compile in Release mode, as it does much better with the optimizations on). If you find any bugs or problems, please let me know.

Thanks,
Charles


(Rod Brown) #6

Ah sorry! To be honest, I didn't see the part about your interoperability with NSProgress trees!

That said, it does seem to be a hacky solution. A better solution would be a backing NSProgress that is linked and lazily created when casts occur, perhaps? This would map far cleaner to current bridging techniques in the Swift Overlay?

Sorry, I haven't looked at how your internal implementation is done. I'm just going off the description in your email. I'm curious to get involved in creating a better progress API if we can find a way that is superior and still cleanly interoperates.

Rod

···

On 22 Feb 2017, at 7:43 am, Rod Brown via swift-evolution <swift-evolution@swift.org> wrote:

My concern regarding a new class in the overlay is interoperability.

With a lot of things in the Swift Overlay, identity isn't relevant. For example, we turn a lot of Objective-C classes into structs because their identity is not relevant, and they should be copied anyway, so it makes sense to let the Swift world be different.

Progress however has innate identity for tracking purposes. Separate Progresses for Swift and Obj-C causes problems. If I want to pass my progress to an Obj-C API, I can't hand it my Swift progress, I must hand it an Obj-C one.

I can't foresee how a new Progress will integrate cleanly with existing NSProgress API. Is it worthwhile creating a separate class that is incompatible with Obj-C Foundation?

On 22 Feb 2017, at 1:40 am, Charles Srstka <cocoadev@charlessoft.com> wrote:

On Feb 21, 2017, at 5:24 AM, Rod Brown <rodney.brown6@icloud.com> wrote:

I applaud the idea. I too find the (NS)Progress API to be very low quality. It seems a vestige of an earlier time when Cocoa was young and APIs that seem like they should be simple, just... aren't. I would love to see a much better API developed.

I'm curious how this idea of developing something from the ground up works with Apple's preferred idea of using Swift to bring Foundation forward.

There are several ground-up replacements in the Swift overlay already, if you look at the various value types that have been introduced to replace various NS classes (granted, some of them call through to the underlying Foundation API for their implementation, but others don’t). As long as we eventually end up supporting NSProgress’s interface, something like this should be possible to drop in as long as we do it before the ABI stability lockdown.

In any event, I BSD-licensed the code, so it’s free to use whether or not Apple ends up using it, so feel free to give it a try (for performance testing, be sure to compile in Release mode, as it does much better with the optimizations on). If you find any bugs or problems, please let me know.

Thanks,
Charles

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Charles Srstka) #7

Agreed that it’s hacky, which is why I tried to do my best to separate the Objective-C compatibility crud from the rest of the code, so it can be easily yanked and replaced with something better if we think of one.

With that said, the way I’m currently doing it is inspired by the way the Data<->NSData bridging works. My class, CSProgress, has a “backing” property. This property stores an enum with two cases, one for Swift, and one for Objective-C. The Swift backing contains the basic implementation, and the Objective-C one wraps an NSProgress, forwarding all changes to it and registering for KVO notifications on it for the purpose of keeping its own clients notified. Each case contains an associated type that handles all the primitives. There’s also a private NSProgress subclass (unimaginatively named “BridgedNSProgress”) that wraps a CSProgress, and forwards NSProgress changes to it.

What you get if you:

- create a CSProgress from scratch, or from another CSProgress: A CSProgress, with the Swift backing.

- bridge an NSProgress to a CSProgress: If the NSProgress was a BridgedNSProgress, you get the CSProgress it was wrapping. Otherwise, you get a CSProgress with the Objective-C backing.

- bridge a CSProgress to an NSProgress: If the CSProgress uses the Objective-C backing, you get the NSProgress it was wrapping. Otherwise, you get a BridgedNSProgress wrapping the CSProgress.

If no bridging occurs and you’re using CSProgress all the way down, no NSProgress objects get involved.

Charles

···

On Feb 21, 2017, at 2:52 PM, Rod Brown <rodney.brown6@icloud.com> wrote:

Ah sorry! To be honest, I didn't see the part about your interoperability with NSProgress trees!

That said, it does seem to be a hacky solution. A better solution would be a backing NSProgress that is linked and lazily created when casts occur, perhaps? This would map far cleaner to current bridging techniques in the Swift Overlay?

Sorry, I haven't looked at how your internal implementation is done. I'm just going off the description in your email. I'm curious to get involved in creating a better progress API if we can find a way that is superior and still cleanly interoperates.


(Rod Brown) #8

It's an interesting implementation, from my cursory look. Inspired by the bridging wrappers that the rest of the Swift Overlay uses, and with a massive performance win in Swift-only uses. Awesome!

It still holds the fundamental oddities of NSProgress that I find weird. Like "current progress" always strikes me as very odd concept.

I'd be interested to see what Tony Parker and the Core Team think!

···

On 22 Feb 2017, at 8:22 am, Charles Srstka <cocoadev@charlessoft.com> wrote:

On Feb 21, 2017, at 2:52 PM, Rod Brown <rodney.brown6@icloud.com> wrote:

Ah sorry! To be honest, I didn't see the part about your interoperability with NSProgress trees!

That said, it does seem to be a hacky solution. A better solution would be a backing NSProgress that is linked and lazily created when casts occur, perhaps? This would map far cleaner to current bridging techniques in the Swift Overlay?

Sorry, I haven't looked at how your internal implementation is done. I'm just going off the description in your email. I'm curious to get involved in creating a better progress API if we can find a way that is superior and still cleanly interoperates.

Agreed that it’s hacky, which is why I tried to do my best to separate the Objective-C compatibility crud from the rest of the code, so it can be easily yanked and replaced with something better if we think of one.

With that said, the way I’m currently doing it is inspired by the way the Data<->NSData bridging works. My class, CSProgress, has a “backing” property. This property stores an enum with two cases, one for Swift, and one for Objective-C. The Swift backing contains the basic implementation, and the Objective-C one wraps an NSProgress, forwarding all changes to it and registering for KVO notifications on it for the purpose of keeping its own clients notified. Each case contains an associated type that handles all the primitives. There’s also a private NSProgress subclass (unimaginatively named “BridgedNSProgress”) that wraps a CSProgress, and forwards NSProgress changes to it.

What you get if you:

- create a CSProgress from scratch, or from another CSProgress: A CSProgress, with the Swift backing.

- bridge an NSProgress to a CSProgress: If the NSProgress was a BridgedNSProgress, you get the CSProgress it was wrapping. Otherwise, you get a CSProgress with the Objective-C backing.

- bridge a CSProgress to an NSProgress: If the CSProgress uses the Objective-C backing, you get the NSProgress it was wrapping. Otherwise, you get a BridgedNSProgress wrapping the CSProgress.

If no bridging occurs and you’re using CSProgress all the way down, no NSProgress objects get involved.

Charles


(Charles Srstka) #9

It still holds the fundamental oddities of NSProgress that I find weird. Like "current progress" always strikes me as very odd concept.

The “current progress” bit is there mainly for source compatibility. I’d expect it would not be the norm for new code.

I'd be interested to see what Tony Parker and the Core Team think!

So would I.

Charles

···

On Feb 21, 2017, at 3:52 PM, Rod Brown <rodney.brown6@icloud.com> wrote:


(Tony Parker) #10

It seems like the main complaints about NSProgress revolve around KVO, which there is no question about not being right for Swift in many ways. I’m aware of that. I think the right answer there is instead to improve KVO in Swift, not to replace all KVO in the SDK on a case-by-case basis with ad-hoc observation mechanisms. I acknowledge that improving KVO is not a small task.

Responding to some of the other notes in your description:

* KVO

NSProgress does not use KVO to update its parent progress objects. You can actually see this in the swift-corelibs-foundation version of NSProgress. It does post notifications for its properties this way though.

* Implicit tree composition

I agree that implicit tree composition is not a great idea except in very controlled circumstances. That’s why I introduced the explicit API.

* Explicit tree composition

It looks like you’ve used this incorrectly. The reason the ProgressReporting protocol exists is for types to expose a progress that the caller can then compose into their own progress tree. There is no requirement to use it, but it establishes the pattern.

// disclaimer: typed in Mail
class X {
    var progress: Progress {
        let p = Progress.discreteProgress(totalUnitCount: 10)
        // start with some progress
        p.completedUnitCount = 5
    }
}

var x = X()
var p = Progress.discreteProgress(totalUnitCount: 2)
var childA = Progress(totalUnitCount: 4, parent: p, pendingUnitCount: 1) // childA is 50% of p
p.addChild(x.progress, pendingUnitCount: 1) // x.progress is 50% of p

p.fractionCompleted // 0.25

* Updating progress in a tight loop

No matter how efficient you make updating the properties (and, again, I acknowledge that KVO support adds a cost here), the fact that progress forms trees and the trees are designed to be arbitrarily large, means that you should always consider the cost of updating too frequently. Reducing the cost per update is still a noble goal. As is reducing autoreleases.

* Updating completedUnitCount atomically

The best practice here is to keep the progress object thread local. I think that updating one progress from multiple threads could be a code smell. Perhaps then you are doing several parts of the work and you should instead form a tree. This also leads in a direction where completed unit count is either 100% handed out to children or 100% controlled by your work-update-progress loop. Mixing the two leads to easy confusion about who is responsible for which portion. If you follow this rule, you never have to get the completed unit count, which means the race you describe does not exist.

···

On Feb 21, 2017, at 2:25 PM, Charles Srstka via swift-evolution <swift-evolution@swift.org> wrote:

On Feb 21, 2017, at 3:52 PM, Rod Brown <rodney.brown6@icloud.com <mailto:rodney.brown6@icloud.com>> wrote:

It still holds the fundamental oddities of NSProgress that I find weird. Like "current progress" always strikes me as very odd concept.

The “current progress” bit is there mainly for source compatibility. I’d expect it would not be the norm for new code.

I'd be interested to see what Tony Parker and the Core Team think!

So would I.

Charles

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Charles Srstka) #11

It seems like the main complaints about NSProgress revolve around KVO, which there is no question about not being right for Swift in many ways. I’m aware of that. I think the right answer there is instead to improve KVO in Swift, not to replace all KVO in the SDK on a case-by-case basis with ad-hoc observation mechanisms. I acknowledge that improving KVO is not a small task.

I am not proposing, and would never propose, replacing all KVO in the entire SDK. I do, however, believe that KVO is not the correct mechanism to use for progress reporting, a use case which incurs all of KVO’s drawbacks, while taking advantage of none of its benefits. The main advantage of KVO is in reducing glue code, allowing you to bind UI elements directly to your model in IB without having to set up the connections manually. Unfortunately, since UI elements must be updated from the main thread, and NSProgress is likely to be used on secondary threads to avoid blocking the UI, this advantage is nullified, since custom code must be written to receive KVO notifications and do work on the main thread. In fact, NSProgress’s use of KVO requires you to write much *more* glue code than otherwise, since the code needed to observe a KVO property manually is much longer, uglier, more error-prone, and generally more painful than a simple closure-based notification. Therefore, the use of KVO in NSProgress strikes me as akin to using a hammer to drive in a screw. There are better tools for this particular task.

Responding to some of the other notes in your description:

* KVO

NSProgress does not use KVO to update its parent progress objects. You can actually see this in the swift-corelibs-foundation version of NSProgress. It does post notifications for its properties this way though.

It does not use KVO to update the parent, but it does cause its parent, and its grandparent, and its great-grandparent, etc. all to post KVO notifications going all the way back up to the root of the tree. That is a lot of notifications.

* Implicit tree composition

I agree that implicit tree composition is not a great idea except in very controlled circumstances. That’s why I introduced the explicit API.

I also agree; unfortunately, the explicit API as it currently exists is not ideal for procedural tasks (see below).

* Explicit tree composition

It looks like you’ve used this incorrectly. The reason the ProgressReporting protocol exists is for types to expose a progress that the caller can then compose into their own progress tree. There is no requirement to use it, but it establishes the pattern.

// disclaimer: typed in Mail
class X {
    var progress: Progress {
        let p = Progress.discreteProgress(totalUnitCount: 10)
        // start with some progress
        p.completedUnitCount = 5
    }
}

var x = X()
var p = Progress.discreteProgress(totalUnitCount: 2)
var childA = Progress(totalUnitCount: 4, parent: p, pendingUnitCount: 1) // childA is 50% of p
p.addChild(x.progress, pendingUnitCount: 1) // x.progress is 50% of p

p.fractionCompleted // 0.25

Using ProgressReporting requires each subtask to be wrapped in a class (each of which needs to be Objective-C, to boot). When performing complex operations with many steps, it does not help:

class SomeClass {
  …

  func doSomething {
    self.subTaskA()
    self.subTaskB()
    self.subTaskC()
  }

  func subTaskA() {
    // do something
  }

  func subTaskB() {
    // do something
  }

  func subTaskC() {
    self.subTaskD()
    self.subTaskE()
  }

  … etc ...
}

Using ProgressReporting for tasks such as this requires that each function be a class, which not only bloats the Objective-C class hierarchy, but also requires substantial boilerplate, since each of those calls now requires three lines of code; one to create the class, one to attach its progress object to yours, and one to actually execute the function. It will also likely perform worse. A more elegant solution, IMO, would allow the sort of loose coupling that the implicit composition method provides, while avoiding the dark magics involved in using the implicit composition method.

* Updating progress in a tight loop

No matter how efficient you make updating the properties (and, again, I acknowledge that KVO support adds a cost here), the fact that progress forms trees and the trees are designed to be arbitrarily large, means that you should always consider the cost of updating too frequently. Reducing the cost per update is still a noble goal. As is reducing autoreleases.

This is true, which is why my CSProgress class includes a property called “granularity”, which causes its notifications to fire only when the delta between the fractionCompleted property and its value at the time of the last fire exceeds some amount. So if the granularity is 0.01 (which is currently my default value for it), and the fraction updates like below, it’ll behave like:

0.0
0.003
0.009
0.012 <- FIRE: 0.012 - 0.0 is greater than 0.01
0.015
0.020
0.023 <- FIRE: 0.023 - 0.012 is greater than 0.01
0.025
0.027

… etc …

This removes the burden of coalescing updates from the worker code, freeing it up to focus on its actual work.

* Updating completedUnitCount atomically

The best practice here is to keep the progress object thread local. I think that updating one progress from multiple threads could be a code smell. Perhaps then you are doing several parts of the work and you should instead form a tree. This also leads in a direction where completed unit count is either 100% handed out to children or 100% controlled by your work-update-progress loop. Mixing the two leads to easy confusion about who is responsible for which portion. If you follow this rule, you never have to get the completed unit count, which means the race you describe does not exist.

NSProgress is designed to be thread-safe by protecting access to its properties with an NSLock, but it is not in fact thread-safe. It is making a promise to its users that it is not keeping. In addition, the lack of an atomic increment means that if you are getting data from an API that only reports deltas, you either have to get the old completedUnitCount, or keep track of the running total yourself. An atomic increment operation would not only improve thread safety, but it would improve the experience, since it would reduce the work the user needs to do, while remaining safe and performant.

Charles

···

On Feb 22, 2017, at 12:52 PM, Tony Parker <anthony.parker@apple.com> wrote:


(Charles Srstka) #12

One thing to add to this is that when dealing with async APIs that you don’t control, you don’t always know that things will be thread-local.

So if some library provides:

public func OpaqueAPI(input: SomeInput, dataHandler: (Data) -> ())

then in your client code:

let progress = Progress.discreteProgress(totalUnitCount: ...)

OpaqueAPI(input: blah) { data in
  // Update my progress by data.count, but I don’t necessarily know what thread this will happen on, or that it will be the same one every time
}

So, now I have to protect either the progress or some running total var that I’m keeping with locks/mutexes/semaphores/sync queues/etc., adding a ton of boilerplate to the method.

Charles

···

On Feb 22, 2017, at 1:41 PM, Charles Srstka via swift-evolution <swift-evolution@swift.org> wrote:

* Updating completedUnitCount atomically

The best practice here is to keep the progress object thread local. I think that updating one progress from multiple threads could be a code smell. Perhaps then you are doing several parts of the work and you should instead form a tree. This also leads in a direction where completed unit count is either 100% handed out to children or 100% controlled by your work-update-progress loop. Mixing the two leads to easy confusion about who is responsible for which portion. If you follow this rule, you never have to get the completed unit count, which means the race you describe does not exist.

NSProgress is designed to be thread-safe by protecting access to its properties with an NSLock, but it is not in fact thread-safe. It is making a promise to its users that it is not keeping. In addition, the lack of an atomic increment means that if you are getting data from an API that only reports deltas, you either have to get the old completedUnitCount, or keep track of the running total yourself. An atomic increment operation would not only improve thread safety, but it would improve the experience, since it would reduce the work the user needs to do, while remaining safe and performant.


(Rod Brown) #13

Charles and Tony,

Would it be more appropriate to consider suggesting that the Foundation team to consider adding a method of choosing their “notification granularity”, and avoid KVO notifications until it meets the granularity requirement, or some other way of minimising the reporting period of the Progress?

I think the big argument from you, Charles, is that the progress can become arbitrarily badly performing, depending on its use, and that could have untold effects up the chain with KVO notifications firing for minor elements of work?

I think this is a fair observation that the responsibility of a reporter shouldn’t be to rate limit the reporting rate for performance reasons. Rather, the observer should be able to adjust the reporting rate to something that is appropriate for the updates they wish to perform. A 50 pixel progress bar only theoretically needs reporting every adjustment of 1/50 of the total work, and other reporting and updating is superfluous and simply a performance problem. But why is a data task responsible for knowing that? It seems backwards.

Perhaps we need to examine the problem here, and how to solve that, rather than the cause? Because I agree with Tony - replacing every use of KVO on a performance basis isn’t a scalable solution.

There are also a few comments inline.

-Rod

It seems like the main complaints about NSProgress revolve around KVO, which there is no question about not being right for Swift in many ways. I’m aware of that. I think the right answer there is instead to improve KVO in Swift, not to replace all KVO in the SDK on a case-by-case basis with ad-hoc observation mechanisms. I acknowledge that improving KVO is not a small task.

I am not proposing, and would never propose, replacing all KVO in the entire SDK. I do, however, believe that KVO is not the correct mechanism to use for progress reporting, a use case which incurs all of KVO’s drawbacks, while taking advantage of none of its benefits. The main advantage of KVO is in reducing glue code, allowing you to bind UI elements directly to your model in IB without having to set up the connections manually. Unfortunately, since UI elements must be updated from the main thread, and NSProgress is likely to be used on secondary threads to avoid blocking the UI, this advantage is nullified, since custom code must be written to receive KVO notifications and do work on the main thread. In fact, NSProgress’s use of KVO requires you to write much *more* glue code than otherwise, since the code needed to observe a KVO property manually is much longer, uglier, more error-prone, and generally more painful than a simple closure-based notification. Therefore, the use of KVO in NSProgress strikes me as akin to using a hammer to drive in a screw. There are better tools for this particular task.

I think the argument for KVO is more that this is the convention for binding UI to data in Obj-C and Cocoa, which would seem consistent. The question is can we find a way to improve this? Is that replacement, or evolution of the API?

Responding to some of the other notes in your description:

* KVO

NSProgress does not use KVO to update its parent progress objects. You can actually see this in the swift-corelibs-foundation version of NSProgress. It does post notifications for its properties this way though.

It does not use KVO to update the parent, but it does cause its parent, and its grandparent, and its great-grandparent, etc. all to post KVO notifications going all the way back up to the root of the tree. That is a lot of notifications.

* Implicit tree composition

I agree that implicit tree composition is not a great idea except in very controlled circumstances. That’s why I introduced the explicit API.

I also agree; unfortunately, the explicit API as it currently exists is not ideal for procedural tasks (see below).

* Explicit tree composition

It looks like you’ve used this incorrectly. The reason the ProgressReporting protocol exists is for types to expose a progress that the caller can then compose into their own progress tree. There is no requirement to use it, but it establishes the pattern.

// disclaimer: typed in Mail
class X {
    var progress: Progress {
        let p = Progress.discreteProgress(totalUnitCount: 10)
        // start with some progress
        p.completedUnitCount = 5
    }
}

var x = X()
var p = Progress.discreteProgress(totalUnitCount: 2)
var childA = Progress(totalUnitCount: 4, parent: p, pendingUnitCount: 1) // childA is 50% of p
p.addChild(x.progress, pendingUnitCount: 1) // x.progress is 50% of p

p.fractionCompleted // 0.25

Using ProgressReporting requires each subtask to be wrapped in a class (each of which needs to be Objective-C, to boot). When performing complex operations with many steps, it does not help:

class SomeClass {
  …

  func doSomething {
    self.subTaskA()
    self.subTaskB()
    self.subTaskC()
  }

  func subTaskA() {
    // do something
  }

  func subTaskB() {
    // do something
  }

  func subTaskC() {
    self.subTaskD()
    self.subTaskE()
  }

  … etc ...
}

Using ProgressReporting for tasks such as this requires that each function be a class, which not only bloats the Objective-C class hierarchy, but also requires substantial boilerplate, since each of those calls now requires three lines of code; one to create the class, one to attach its progress object to yours, and one to actually execute the function. It will also likely perform worse. A more elegant solution, IMO, would allow the sort of loose coupling that the implicit composition method provides, while avoiding the dark magics involved in using the implicit composition method.

* Updating progress in a tight loop

No matter how efficient you make updating the properties (and, again, I acknowledge that KVO support adds a cost here), the fact that progress forms trees and the trees are designed to be arbitrarily large, means that you should always consider the cost of updating too frequently. Reducing the cost per update is still a noble goal. As is reducing autoreleases.

This is true, which is why my CSProgress class includes a property called “granularity”, which causes its notifications to fire only when the delta between the fractionCompleted property and its value at the time of the last fire exceeds some amount. So if the granularity is 0.01 (which is currently my default value for it), and the fraction updates like below, it’ll behave like:

0.0
0.003
0.009
0.012 <- FIRE: 0.012 - 0.0 is greater than 0.01
0.015
0.020
0.023 <- FIRE: 0.023 - 0.012 is greater than 0.01
0.025
0.027

… etc …

This removes the burden of coalescing updates from the worker code, freeing it up to focus on its actual work.

* Updating completedUnitCount atomically

The best practice here is to keep the progress object thread local. I think that updating one progress from multiple threads could be a code smell. Perhaps then you are doing several parts of the work and you should instead form a tree. This also leads in a direction where completed unit count is either 100% handed out to children or 100% controlled by your work-update-progress loop. Mixing the two leads to easy confusion about who is responsible for which portion. If you follow this rule, you never have to get the completed unit count, which means the race you describe does not exist.

NSProgress is designed to be thread-safe by protecting access to its properties with an NSLock, but it is not in fact thread-safe. It is making a promise to its users that it is not keeping. In addition, the lack of an atomic increment means that if you are getting data from an API that only reports deltas, you either have to get the old completedUnitCount, or keep track of the running total yourself. An atomic increment operation would not only improve thread safety, but it would improve the experience, since it would reduce the work the user needs to do, while remaining safe and performant.

I would agree with Tony that this would seem to be a code smell. Updating a single progress from two threads seems like you don’t have a handle on a centralised value for how much work is updated.

···

On 23 Feb 2017, at 6:41 am, Charles Srstka <cocoadev@charlessoft.com> wrote:

On Feb 22, 2017, at 12:52 PM, Tony Parker <anthony.parker@apple.com <mailto:anthony.parker@apple.com>> wrote:

Charles


(Charles Srstka) #14

I think the big argument from you, Charles, is that the progress can become arbitrarily badly performing, depending on its use, and that could have untold effects up the chain with KVO notifications firing for minor elements of work?

I think this is a fair observation that the responsibility of a reporter shouldn’t be to rate limit the reporting rate for performance reasons. Rather, the observer should be able to adjust the reporting rate to something that is appropriate for the updates they wish to perform. A 50 pixel progress bar only theoretically needs reporting every adjustment of 1/50 of the total work, and other reporting and updating is superfluous and simply a performance problem. But why is a data task responsible for knowing that? It seems backwards.

Very much so. It has been my opinion for a long time that the back-end code is *not* the appropriate place for this stuff to be.

Perhaps we need to examine the problem here, and how to solve that, rather than the cause? Because I agree with Tony - replacing every use of KVO on a performance basis isn’t a scalable solution.

Again, I am not advocating for replacing every use of KVO. For many cases, KVO is conceptually a great solution (even if I hate its implementation). For binding values in the UI to properties in your view controller? It’s great. For populating table views with arrays of values, or doing pretty much anything Core Data-related? It’s a godsend. However, for the specific task of progress reporting, IMO, it is Considered Harmful. We cannot reap the benefits of it, because of the threading issue.

Can we bind our UI elements to it to avoid glue code? No, because it might be updated on a background thread.

Okay, can we just be sure to wrap all updates on the main dispatch queue? No, because if we add any sub-progresses to the tree that are managed by opaque library code, we can’t know that the library will follow the same pattern. Also: without some kind of coalescing mechanism, we’ll end up with a crazy number of operations flooding the queue.

Can we fix KVO to make it fire on the main thread? No, because KVO is extensively used all over the Objective-C frameworks, and making such a large change to it is bound to break binary compatibility all over the place. Also, the need for the -willChangeValueForKey: method to complete before changing the property really hamstrings any attempts to make KVO thread-safe in a sane way.

Can we add some Swift-native KVO equivalent, and have that work the way we want it to? Yes, and this would be fantastic! But with Progress/NSProgress being the same class instead of a custom Swift class, we’d need to keep the Objective-C KVO system in place for compatibility with Objective-C code, which means we’ll still have the performance drawbacks of having -willChangeValueForKey: and -didChangeValueForKey: called every time we update it.

Can we just write our own KVO observer, observe the progress, and forward things to the main queue from there? Of course. But the UI for that is terrible compared to anything closure-based, so then why are we using KVO?

My opinion: KVO is cool for the things it’s good at, but it’s the wrong tool here.

Charles

···

On Feb 22, 2017, at 2:05 PM, Rod Brown <rodney.brown6@icloud.com> wrote:


(Rod Brown) #15

I agree with your assessment that KVO may not be the right tool for this because the binding doesn't allow you to declare which thread you'd like the observation posted on. I suspect this design was supposed to be inline with the patterns of Cocoa, despite the fact that the realities of multi-thread access make its use as such somewhat more limited.

Apple themselves seemed to find a workaround by making UIProgressViews on iOS themselves responsibility for handling the KVO and threading considerations. It seems a rather ad hoc solution that shows a distinct issue with the API, and I'm curious why the update wasn't carried to NSProgressIndicator, but it works.

I think the bigger issue is: we have Progress now, and a complicated workaround for a different type of NSProgress replacement has two problems:

1. The optimisations won't carry across to Obj-C code.

2. It seems any update we make to Progress in the way you have would be somewhat hacky, and we would have a very difficult time explaining to users. This doesn't seem like a first class solution, no matter how creative it is. The value/copy semantics make the Struct Variants in the Foundation Overlay clean, but the fact this API is based on identity and encapsulation of state would make this a messy solution.

While KVO is perhaps not the best tool for this, it is the one that exists. Perhaps it's better to work on fixing NSProgress's flaws so everyone can benefit? Adding granularity controls and perhaps(?) a block/closure based callback option might fix the majority of the issues, and have the added benefit of paying dividends to everyone.

Rod

···

On 23 Feb 2017, at 7:39 am, Charles Srstka <cocoadev@charlessoft.com> wrote:

On Feb 22, 2017, at 2:05 PM, Rod Brown <rodney.brown6@icloud.com> wrote:

I think the big argument from you, Charles, is that the progress can become arbitrarily badly performing, depending on its use, and that could have untold effects up the chain with KVO notifications firing for minor elements of work?

I think this is a fair observation that the responsibility of a reporter shouldn’t be to rate limit the reporting rate for performance reasons. Rather, the observer should be able to adjust the reporting rate to something that is appropriate for the updates they wish to perform. A 50 pixel progress bar only theoretically needs reporting every adjustment of 1/50 of the total work, and other reporting and updating is superfluous and simply a performance problem. But why is a data task responsible for knowing that? It seems backwards.

Very much so. It has been my opinion for a long time that the back-end code is *not* the appropriate place for this stuff to be.

Perhaps we need to examine the problem here, and how to solve that, rather than the cause? Because I agree with Tony - replacing every use of KVO on a performance basis isn’t a scalable solution.

Again, I am not advocating for replacing every use of KVO. For many cases, KVO is conceptually a great solution (even if I hate its implementation). For binding values in the UI to properties in your view controller? It’s great. For populating table views with arrays of values, or doing pretty much anything Core Data-related? It’s a godsend. However, for the specific task of progress reporting, IMO, it is Considered Harmful. We cannot reap the benefits of it, because of the threading issue.

Can we bind our UI elements to it to avoid glue code? No, because it might be updated on a background thread.

Okay, can we just be sure to wrap all updates on the main dispatch queue? No, because if we add any sub-progresses to the tree that are managed by opaque library code, we can’t know that the library will follow the same pattern. Also: without some kind of coalescing mechanism, we’ll end up with a crazy number of operations flooding the queue.

Can we fix KVO to make it fire on the main thread? No, because KVO is extensively used all over the Objective-C frameworks, and making such a large change to it is bound to break binary compatibility all over the place. Also, the need for the -willChangeValueForKey: method to complete before changing the property really hamstrings any attempts to make KVO thread-safe in a sane way.

Can we add some Swift-native KVO equivalent, and have that work the way we want it to? Yes, and this would be fantastic! But with Progress/NSProgress being the same class instead of a custom Swift class, we’d need to keep the Objective-C KVO system in place for compatibility with Objective-C code, which means we’ll still have the performance drawbacks of having -willChangeValueForKey: and -didChangeValueForKey: called every time we update it.

Can we just write our own KVO observer, observe the progress, and forward things to the main queue from there? Of course. But the UI for that is terrible compared to anything closure-based, so then why are we using KVO?

My opinion: KVO is cool for the things it’s good at, but it’s the wrong tool here.

Charles