I think it's even worse than the non-atomicity suggests, because reduce
is modifying these properties asynchronously in response to progress updates from children. So even if my uses of my properties are single-threaded (which the API in no way guarantees), I can still race against my children, whom in general I have no control over.
Another problem with NSProgress
that this proposal also ignores is that of "update frequency". With NSProgress
you were supposed to KVO it to update your UI, but its value could be updated arbitrarily quickly, leading to performance problems. With ProgressReporter
, the problem is punted to the observer. If the observer is always SwiftUI, that might almost be OK; I presume it already throttles updates to the refresh rate. But if someone's trying to integrate this into AppKit/UIKit/whatever on another platform, they will have to add custom logic to throttle their updates from the ProgressReporter
.
I also think there's an internal problem with frequent updates: basically, the way the API is constructed, whenever a child updates its properties:
- the parent constructs a new aggregate
Properties
instance - for each of its children,
- the parent reads the child's properties
- the parent calls
reduce
to fold those properties into the new aggregate
- the parent replaces its own properties with the newly-reduced version, which then recursively applies this procedure in its parent (throwing away any manual changes that were made to its own properties; that might be another issue?).
If children are spamming updates into a large tree, the overhead of this recomputation could easily overwhelm the actual operations being performed.
I see, thanks for the clarifications! I tried it out and it is true that when two threads try to mutate the same property of a ProgressReporter
, such as totalCount
, in place at the same time, the results become inconsistent. The end result sometimes reflects mutations from both threads (good), but sometimes reflects a mutation from only one thread (really bad).
In this case, would the best way to approach this be to expose the Mutex
backing totalCount
and properties
directly so that developers always have to do the mutations within withLock
closures?
Or would it be better to not expose a setter via the properties and instead introduce a method that will set
the property, similar to how completedCount
exposes a getter but will only be mutated through calls to the method complete(count:)
?
Yep, you're right about the constraints here. I think I could've been clearer with what I meant, let me clarify this in code, and feel free to let me know what you think about this!
Taking your example, the call site I was envisioning would look something like this:
// Me:
struct MyProperties: ProgressProperties { ... }
let myReporter = ProgressReporter<MyProperties>(totalCount: 200)
// to hold reference to traverse reporter
let traverseIntermediary = myReporter.assign(totalCount: 100, kind: DirectoryTraversalProperties.self)
let traverseReporter = traverseIntermediary.reporter(totalCount: 100)
// to hold reference to transcode reporter
let transcodeIntermediary = myReporter.assign(totalCount: 100, ImageTranscoderProperties.self)
let transcodeReporter = transcodeIntermediary.reporter(totalCount: 100)
// assign from traverse reporter & transcode reporter
try await traverse(
"path",
glob: "*.png",
progress: traverseReporter.assign(count: 100)
) { path in
try await transcode(
path,
toFormat: .jpeg,
dest: path.replacing(".png", with: ".jpeg"),
progress: transcodeReporter.assign(count: 100)
)
}
This way, we'll be able to access properties of from traversing the directory and transcoding images from the references that we hold to the traverseReporter
and transcodeReporter
.
One note for the potential users of this API as pitched; it would interoperate with the Observed
AsyncSequence when accessing the ProgressReporter
- so any calculation for that done in the closure passed to Observed
would emit a sequence of values composed from said progress.
Progress reporting is similar to metrics. What we are doing in swift-metrics
is only expose compound actions e.g. our counter has a method called increment. The underlying implementation can then decide how to safely implement the compound action. This can either be done by doing the get and set in a single withLock
"transaction" or in some cases backed by atomic operations.
Sounds good, thank you! I'll make sure to resolve the issues of backing totalCount
and properties
with a lock in this API.
In my mind, this makes the proposed design a non-starter. I need progress values to be part of multiple hierarchies, and it's a total PITA to get Foundation.Progress
to do this.
Here's a very very simple example:
- in my app, there's a central manager that downloads things
- downloads are uniquely identified with stable identifiers
- the download manager supports downloading packages of things
The problem:
- Multiple packages may request the same thing to be downloaded
- I don't want to download the same thing multiple times
- Each package should be reporting the progress of the single, unique download
Currently I've got a helper on Foundation.Progress
that creates a new progress that basically KVO's everything on the original and copies over the values to the new progress, because Foundation.Progress
throws an exception if you try to add the same progress as the child of multiple parents.
When I started reading this, I was really hoping to see an observation-based approach, and not a parent-child approach. It was quite disappointing to not see this addressed, since IMO this is one of the fatal flaws of Foundation.Progress
and makes this entire API wholly undesirable.
I've also been finding immense value in Foundation.Progress
's resumingHandler
, cancellationHandler
, and pausingHandler
. They've basically allowed me to use a Progress
almost like a future; I can return it out of the "async" method and then allow clients to pause and resume and cancel and can easily react to that.
I take it then that the expectation of this API is that I can't use it as-is and will always be expected to wrap it up in some sort of action-providing thing that implements pause()
, resume()
, and cancel()
?
Hi Dave! Yes, as mentioned in the alternatives considered section of the proposal, we decided that this API will not support cancellation, pausing and resuming the way Foundation.Progress
did due to two main reasons:
-
ProgressReporter
's goal is to report progress, and it does not assume ownership over the task that is reporting on. The consequence of this is thatProgressReporter
will not be responsible for acting upon the task that it is reporting on. -
Task cancellation is a feature that is available in Swift concurrency, and we encourage developers wishing to cancel a task to utilize that feature. This is to more clearly separate the responsibilities of a
Task
and aProgressReporter
.
If all the API is doing is conveying some information downstream to people listening for changes... then maybe progress reporting might be better modeled as an AsyncSequence of progress values with some special combineLatest()
operators to handle merging "child" progress values into a single, overall progress value.
Hi Dave, would you mind clarifying further what this progress tree would look like in this case?
Is it that there's an overall ProgressReporter
, and that it has children ProgressReporter
which would each address downloads happening in a single package?
Imagine you're building an app that supports downloading packages. Each package consists of one or more assets that must be downloaded. You establish some requirements:
- Each asset should only be downloaded once, regardless of how many packages it "belongs" to
- Each asset download should be able to report its individual progress
- Each package should be able to report its progress of downloading its referenced assets
- It would be nice to get a total progress of all asset downloads
- It would be nice to get a total progress of all packages
Drawing this out, you might end up with a progress reporting hierarchy like the following (A
through F
are the individual asset downloads; Pkg 1
through Pkg 3
are three packages of assets, which each reference 3 assets that must be downloaded. Arrows represent progress reporting for that thing).
As you can see, it is trivial to imagine scenarios where a single progress reporter would want to belong to multiple hierarchies. For example download A
technically would have direct "parents" of Pkg 1
, Pkg 2
, and Total Download Progress
.
This is why the single parent model is insufficient.
I think the continued usage of "parent" and "children" and "tree" is really complicating this pitch and design.
A thing produces progress and zero or more things want to consume that progress. I think we should be looking at producer/consumer or pub/sub styles of progress reporting, not strict and inflexible parent/child hierarchies.
Hi everyone! I've made some major updates to the proposal addressing some concerns brought up in this thread, feel free to check it out and I'd appreciate any feedback / comments / questions anyone has!
Very nice update, thank you!
For those wondering, the highlights are:
- v2 Major Updates:
- Replaced generics with@dynamicMemberLookup
to account for additional metadata
- Replaced localized description methods withProgressReporter.FormatStyle
andProgressReporter.FileFormatStyle
- Replaced top leveltotalCount
to be get-only and only settable viawithProperties
closure
- Added the ability forcompletedCount
to be settable viawithProperties
closure
- Omitted checking ofTask.cancellation
incomplete(count:)
method
And the diff is here: [Proposal] Progress Reporting in Swift Concurrency by chloe-yeo ยท Pull Request #1192 ยท swiftlang/swift-foundation ยท GitHub
Happy to see the cancellation change in complete(count:)
that makes more sense
Have you looked into the performance overhead of the dynamic member lookup approach? There are some situations it can be pretty fast but it's something that is on my mind whenever we reach for that.
allowing the freedom to add a ProgressReporter to more than one tree may compromise the safety guarantee we want to provide in this API.
So... does it compromise the safety or not? If it does, how? If it doesn't, why was this idea rejected?
Hi Dave, thanks for the question! The issue with compromising safety guarantee that we provide to developers came up previously, and is as follows:
Hope this answers your question!
Maybe I'm just not understanding very well, but it seems like we're saying "we've designed the API so that a single progress can't have multiple parents, and we can't change the design to allow multiple parents, because then our API couldn't safely enforce the restriction on multiple parents".
If we decided we did want to allow a progress to have multiple parents, then the API would naturally have to be designed differently. But are you saying that it's impossible for such a design to ensure safety?
I'm still confused. Isn't the issue of multiple reporters a result of dealing with reference types? And wouldn't that safety issue around ~Copyable
be entirely removed if this switched to a pub/sub model of value types?
I've been going back over the proposal and I'm concerned about its motivation and its goals. The motivation says that the intent is to make progress reporting "compatible with async/await style concurrency". There's nothing about making progress reporting better or fixing any of the longstanding issues that we've seen with Foundation.Progress
.
Because of that, this means that this effort (if approved in its current state) will result in an API that is functionally identical to Foundation.Progress
but with no enhancements nor improvements (and in fact worse in some ways because we'd be losing methods like pause()
and handlers like resumingHandler
). And since Foundation.Progress
has well-known shortcomings with its API, we'd be committing to perpetuating those shortcomings for the next few years until we decide to actually address them.
Can we address them now instead?
Can we instead be designing a progress reporting API that's a first-class citizen of the language and integrates with language constructs like Task
? Can we design an API that supports the strict hierarchical reporting like Foundation.Progress
, but also supports the multiple consumer model? Can we design an API that will allow deriving multiple values from the same progress, instead of being limited to the fundamental fractionCompleted: Double
that is imprecise and relies on conventions like "0 = indeterminate"?