Hi,
I am currently rethinking a project that contains about 70,000 lines of mostly Combine code. I used Combine not so much because of the concurrency aspect, but because of the reactive nature of the problem.
The code deals with multiple external event sources which post updates via a network protocol. Parts of the code are quite complex in the way external states are combined into higher-level views and states. Combine, although initially quite challenging, solves the problem quite elegantly. Once you know what you're doing, it has proven to produce a pretty reliable application with minimal problems.
But... looking at what Apple is doing with regard to async/await and Swift Testing, Combine seems to be a zombie - and I am asking myself whether I should rewrite the code using Combine again or find a way to migrate to Swift concurrency.
I have seen quite a few posts about how AsyncSequence etc. can do what a publisher does, and how continuations can be used to bridge to async/await. But all the samples are super basic, and what I need is at another level of complexity: I need quite complex recombination logic, and I barely ever have anything that's just a simple publisher. Instead I have multiple publishers that need to be recombined into a higher-level state.
So my question is: What would you recommend? Should I continue with the Combine Zombie as it works or continue to spend time trying to understand how I can massage my code (and head) from reactive code to async await (somehow)?
My head is already dizzy :-) - here are some questions I am currently trying to answer:
How do I bridge from Network.framework to async/await (including the fact that network connections can be cancelled, e.g. an await read() -> Data needs to be cancellable)?
How can I replace combination logic like (in the simplest case) CombineLatest, where I have to await multiple 'upstream' read calls and then calculate a new state based on that (partial) information whenever one of the calls returns a value?
I guess part of the problem is that I am thinking in reactive code atm - and it is super tough for me to figure out how I would do it with async/await.
It would be super helpful if you could point me to examples or articles that move from Combine to async/await and cover more complex scenarios.
Sorry if I sound confused - it's largely because I am. All my async await experiments so far looked super awkward and seemed to be far more complicated than Combine.
It is hard to recommend anything without the full context of your project, business and domain. But I can share my own experience.
I worked a lot with RxSwift, and a lot of code was also done with state machines. Rewriting to Combine was quite easy, with minimal pitfalls. Sometimes adapters / converters between Rx and Combine are needed until everything is rewritten.
I haven't migrated the whole project to modern concurrency. But the parts that were migrated took several times longer.
AsyncChannel / AsyncStream work differently from the Rx and Combine primitives. New kinds of problems appear, like the lack of multi-consumer support in Async(Throwing)Stream, and the fact that everything becomes async while Combine is synchronous by default.
Isolation, sendability, actors and a whole bunch of other concepts come with the new concurrency model; it is not possible to just use async channels / streams and ignore the rest.
As my project was split into modules, I began to use modern concurrency in an encapsulated way, with no interface changes. I mean that internally classes began to use the new concurrency primitives, but their interfaces (protocols) stayed the same. Some classes turned into actors.
Once a module has enough code rewritten for the new stack, its interface is changed as well. Modularization helped a lot in adopting the new concurrency model in small pieces.
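To picture "new concurrency on the inside, same interface on the outside", here is a minimal sketch of a class that keeps its completion-handler protocol while moving its mutable state into an actor. All names (`CounterService`, `DefaultCounterService`, `Storage`) are hypothetical and made up for illustration; they are not taken from any project discussed in this thread.

```swift
/// Hypothetical pre-existing interface: callers keep using completion handlers.
protocol CounterService {
    func increment(completion: @escaping @Sendable (Int) -> Void)
}

/// The implementation now protects its state with an actor instead of a queue or lock.
final class DefaultCounterService: CounterService, Sendable {
    private actor Storage {
        private var count = 0

        func increment() -> Int {
            count += 1
            return count
        }
    }

    private let storage = Storage()

    func increment(completion: @escaping @Sendable (Int) -> Void) {
        // Capture only the actor, not self, so the task does not extend the service's lifetime.
        Task { [storage] in
            completion(await storage.increment())
        }
    }
}
```

One thing this sketch makes visible: each call hops through its own `Task`, so two calls made back to back are not guaranteed to reach the actor in call order. That is one instance of the "everything becomes async" difference mentioned above.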
Hmmm... I do have around 500 different event types from up to 30 sources and around 40 models that each react to 30-40 of these event types. Some of the stuff is time sensitive as well.
I am currently rewriting the base module, which is the network adapter connecting to the devices that generate the events. It establishes the communication basics - and all bad decisions made here trickle through the rest of the code (this is why I am rewriting it).
I am actually quite scared about the new kind of problems I may run into. (I have tried to understand what would be needed to enable a Continuation to be properly cancelled and it wasn't easily readable or obvious to me).
On the other hand I don't really know many people who know Combine at a level that is required to deal with my code and all the Combine flaws and bugs... just debugging is crazy sometimes... which poses scalability issues.
My instinct tells me to stay with what I can control - but I have been very wrong in the past :-)
You're probably using (or planning to use) withCheckedThrowingContinuation to bridge the Network.framework API to the async/await world, right?
The general pattern to support cancellation is to wrap the withCheckedThrowingContinuation call in a call to withTaskCancellationHandler. This calls you back when the surrounding task gets cancelled, giving you a chance to forward the cancellation to your network connection. I don't know much about Network.framework, but cancelling the NWConnection will then hopefully trigger a callback from Network.framework into your code, which you can use to resume the continuation with a CancellationError().
Note that the onCancel: handler of withTaskCancellationHandler runs in a different concurrency context than the task in which you set up the network connection, so you may need to introduce a Mutex or similar to synchronize access to your state.
See also this thread for inspiration: Automatically Cancelling Continuations. (Although IIRC the semantics of @nikolai.ruhe's withCancellingContinuation API are to resume the continuation directly without going through the library you're interfacing with first – Network.framework in your case. Not sure if this is what you want.)
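The pattern described above could look roughly like the sketch below. It deliberately does not use Network.framework: `LegacyConnection` is a hypothetical stand-in for a callback-based connection, assumed to fail its pending read when cancelled and to track only one pending read at a time. An `NSLock` is used for synchronization instead of a `Mutex` purely to keep the example self-contained.

```swift
import Foundation

/// Hypothetical callback-based connection standing in for the real networking API.
final class LegacyConnection: @unchecked Sendable {
    private let lock = NSLock()
    private var pending: (@Sendable (Result<Data, Error>) -> Void)?

    /// Registers a one-shot completion handler for the next chunk of data.
    func receive(completion: @escaping @Sendable (Result<Data, Error>) -> Void) {
        lock.lock()
        pending = completion
        lock.unlock()
    }

    /// Simulates data arriving from the network.
    func deliver(_ data: Data) { complete(with: .success(data)) }

    /// Cancelling fails the pending read; if nothing is pending it does nothing.
    func cancel() { complete(with: .failure(CancellationError())) }

    private func complete(with result: Result<Data, Error>) {
        lock.lock()
        let handler = pending
        pending = nil
        lock.unlock()
        handler?(result) // each registered handler is invoked at most once
    }
}

extension LegacyConnection {
    /// Async wrapper: cancelling the surrounding task cancels the pending read.
    func read() async throws -> Data {
        try await withTaskCancellationHandler {
            try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Data, Error>) in
                self.receive { result in
                    continuation.resume(with: result)
                }
                // If the task was cancelled before the handler was registered,
                // onCancel already ran and found nothing pending, so check again here.
                if Task.isCancelled { self.cancel() }
            }
        } onCancel: {
            // May run concurrently with the operation closure, hence the lock in the connection.
            self.cancel()
        }
    }
}
```

The continuation is resumed only through the connection's own callback, so it is resumed exactly once whether the read completes, the task is cancelled mid-read, or the task was already cancelled before the read started.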
What I can also recommend is to dive deeply into lifecycle and share operators to understand them fundamentally.
In Rx there is a DisposeBag; in Combine people typically store subscriptions in a Set<AnyCancellable>. A Set<AnyCancellable> does nothing when deallocated, but DisposeBag does: it disposes all the disposables it contains.
While there are differences between these two, conceptually you are responsible for keeping references to Disposable / AnyCancellable.
With new concurrency it differs more. Just one simple example:
    Task { [weak self] in
        for try await value in stream {
            do {
                try await doSmth(with: value)
            } catch {
                self?.forwardError(error)
            }
        }
    }
Two memory leaks can be introduced here:
First, if you forget to capture self weakly. Unlike other escaping closures, Task closures do not require an explicit self:
    import Dispatch

    final class Foo: Sendable {
        func foo() {
            DispatchQueue.main.async {
                self.bar() // self must be captured explicitly
            }
            Task {
                bar() // no explicit self needed, so a strong capture is easy to miss
            }
        }

        func bar() {}
    }
Second, for try await value in stream can potentially be infinite. When a Disposable / AnyCancellable is destroyed, the whole computation chain is destroyed with it. But in the code above a strong reference to the Task is held by the Swift runtime, and there is no DisposeBag to destroy it when it is no longer needed.
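One possible way to address both points is to keep the task handle and cancel it in deinit, which roughly plays the role of an AnyCancellable or DisposeBag. This is a sketch under assumptions: `EventConsumer` is a hypothetical model type, and it is marked `@MainActor` only to keep its mutable state simple.

```swift
/// Hypothetical model that consumes a stream for as long as it is alive.
@MainActor
final class EventConsumer {
    private(set) var received: [Int] = []
    private var task: Task<Void, Never>?

    func start(observing stream: AsyncStream<Int>) {
        task?.cancel()
        task = Task { [weak self] in
            for await value in stream {
                // Weak capture: self is only held for the duration of each call.
                self?.receive(value)
            }
        }
    }

    private func receive(_ value: Int) {
        received.append(value)
    }

    deinit {
        // Without this, the loop above would keep running after the consumer is gone.
        task?.cancel()
    }
}
```

Because the task only holds a weak reference, releasing the last strong reference to the consumer runs deinit, which cancels the task and ends the `for await` loop.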
To be clear, I am not trying to tell you not to adopt the new concurrency model. It is great, and I have great examples of using it.
Some points of such a migration:
understand the semantics and nuances. In this forum there are plenty of threads about Async(Throwing)(Stream)(Channel). A good practice is to explore the API in a clean project.
do not change root code that affects most of the codebase at first. Initially it is better to change code which is locally scoped and carries minimal risk if broken
as you gain more experience, move on to more complex code
You mean AnyCancellable instances call cancel() in their own deinit, not Set itself.
In Rx, Disposable.dispose() is called by DisposeBag internally, in a for-each loop, when the DisposeBag is deallocated.
The general concept is the same, but design is slightly different.
It’s your call, but I would think long and hard about doing a rewrite just for the sake of adopting the newer tech stack. Combine hasn’t been deprecated. It is not a “zombie”. It still works.
Don’t get me wrong: I don’t use Combine in new projects and I prefer async-await. But I would not do a rewrite just for the sake of it. I would only contemplate it if you are solving some significant challenges and if there was sufficient return for that time investment.
For API that will call a completion handler once and only once, if it doesn’t already have an async rendition, you would wrap the legacy API in Swift concurrency using withChecked[Throwing]Continuation or the like. If it supports cancelation, also wrap it in a withTaskCancellationHandler.
And if there are APIs that will call a closure multiple times, we would wrap them in a custom AsyncSequence. And you would integrate cancellation to make sure you propagate it to the underlying API (e.g., using AsyncStream’s onTermination handler).
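As a rough illustration of the multi-callback case, here is a sketch that wraps a callback-based source in an AsyncStream and uses onTermination to stop the source. `LegacyEventSource` and `events(from:)` are hypothetical names invented for this example, standing in for whatever closure-based API is being wrapped.

```swift
import Foundation

/// Hypothetical callback-based source that calls its handler many times.
final class LegacyEventSource: @unchecked Sendable {
    private let lock = NSLock()
    private var handler: (@Sendable (String) -> Void)?

    var isRunning: Bool {
        lock.lock()
        defer { lock.unlock() }
        return handler != nil
    }

    func start(handler: @escaping @Sendable (String) -> Void) {
        lock.lock()
        self.handler = handler
        lock.unlock()
    }

    func stop() {
        lock.lock()
        handler = nil
        lock.unlock()
    }

    /// Simulates an incoming event.
    func emit(_ event: String) {
        lock.lock()
        let current = handler
        lock.unlock()
        current?(event)
    }
}

/// Exposes the source as an AsyncStream and stops it when the stream terminates.
func events(from source: LegacyEventSource) -> AsyncStream<String> {
    AsyncStream { continuation in
        // Runs when the consuming task is cancelled, the stream finishes, or it is discarded.
        continuation.onTermination = { _ in source.stop() }
        source.start { event in
            continuation.yield(event)
        }
    }
}
```

Cancelling the task that iterates the stream ends the `for await` loop and triggers onTermination, which is where the cancellation is forwarded to the underlying source.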
Yes, a Combine AnyCancellable is cancelled when it falls out of scope. As a result of that, when a Set<AnyCancellable> falls out of scope, all of the AnyCancellable in that Set are cancelled.
That is functionally similar to Rx’s DisposeBag pattern, namely that everything is canceled.
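A small sketch of that behaviour, assuming a platform where Combine is available (hence the `canImport` guard). The function name `valuesSeenBeforeRelease` is made up for this example.

```swift
#if canImport(Combine)
import Combine

/// Returns the values a subscriber saw before its AnyCancellable was released.
func valuesSeenBeforeRelease() -> [Int] {
    let subject = PassthroughSubject<Int, Never>()
    var seen: [Int] = []
    var cancellables = Set<AnyCancellable>()

    subject
        .sink { seen.append($0) }
        .store(in: &cancellables)

    subject.send(1)            // delivered
    cancellables.removeAll()   // releases the AnyCancellable, whose deinit calls cancel()
    subject.send(2)            // not delivered: the subscription is gone
    return seen
}
#endif
```

The Set itself has no special behaviour here; emptying it (or letting it go out of scope) simply releases the AnyCancellable instances, and each of them cancels its subscription as it is deallocated.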
Now, if your point is that Swift concurrency behaves differently than Combine (and/or Rx), that tasks and streams don’t just cancel when they fall out of scope, I agree 100%. We have to manually cancel them (say, when a view disappears, or what have you). That having been said, if using SwiftUI’s .task view modifier, that will automatically cancel them for us when the view disappears.
I was a Combine newbie when I started this project... and there are a few design choices that bother me.
My problem actually is:
The Combine code is easy and a joy to write and I know what I am doing
VS
I would be an async/await newbie with a bunch of complicated nuts to crack - and I am pretty sure I won't get it right the first time.
I do actually use AnyCancellables a lot to manage the lifecycle of my objects.
I even have Subjects that count subscriptions to engage the logic that registers for events/data over network when subscribed. That way the UI dynamically manages the information it needs from remotes.
Stuff like that is actually really easy to build in reactive code - not sure how this would translate to async/await.
It is typical for reactive code to store Disposables and to use different kinds of them, like CompositeDisposable, BooleanDisposable, BinaryDisposable, SingleAssignmentDisposable and others.
However, from what I see in practice, things can often be done more simply, like saving every Disposable to a DisposeBag and forgetting about it.
Those are the cases that RefCount, ConnectableObservable and similar things were invented for.
Sure, that is one of the reasons why Combine was adopted in SwiftUI and why Rx became popular.
I have some time until the end of the year, so I can try to help with this reactive logic if you wish. Though I think that would be better done in direct messages.
Hi Dmitry,
thanks for the offer! I have decided to stay with Combine for now. The learning curve is just too steep for now. I have Combine under control and I am very productive with it - so I'll try to create my cleanest Combine code and reserve async await for a smaller spin off project.
So... since it is the holidays, I did not give up that easily (but spoiler alert, I am still staying with Combine for large parts). Slowly mixing in concurrency does not seem that hard though if you care about only pushing Sendable types through publishers (which is a good idea anyway!)
A few things that I feel are worth mentioning:
I discovered the 'values' property of Publisher... The wrapping of a publisher into an AsyncSequence is already built in.
That was quite valuable to move my testing code to Swift Testing as I could write simple tests using async/await. I also found an implementation for waitForExpectation that was quite valuable. I quite like Swift Testing!
I had a look at the async-tools project on GitHub (apple/swift-async-algorithms: Async Algorithms for Swift), because I was curious how they would implement, for example, CombineLatest in Swift concurrency. And I found what I suspected: the implementation creates a sub-task for every upstream it is monitoring - and the management seems to be complex. I am not convinced that MANY tasks that get woken up to pass information along the chains would ever be more efficient than a simple publisher chain executed in the context of a single DispatchQueue work item, nor that async-tools is actually the way to go if you want to write reactive code.
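To make the "one sub-task per upstream" structure concrete, here is a heavily simplified, hand-rolled sketch of a two-input combine-latest over AsyncStreams. It is not the swift-async-algorithms implementation and ignores back pressure and error handling; `LatestPair` and `combineLatestSketch` are hypothetical names used only for this illustration.

```swift
/// Holds the latest value from each upstream and emits a pair once both have produced one.
actor LatestPair<A: Sendable, B: Sendable> {
    private var first: A?
    private var second: B?
    private let output: AsyncStream<(A, B)>.Continuation

    init(output: AsyncStream<(A, B)>.Continuation) {
        self.output = output
    }

    func updateFirst(_ value: A) {
        first = value
        emitIfComplete()
    }

    func updateSecond(_ value: B) {
        second = value
        emitIfComplete()
    }

    private func emitIfComplete() {
        // Yielding inside the actor keeps emissions in the same order as the state updates.
        if let first, let second {
            output.yield((first, second))
        }
    }
}

func combineLatestSketch<A: Sendable, B: Sendable>(
    _ first: AsyncStream<A>,
    _ second: AsyncStream<B>
) -> AsyncStream<(A, B)> {
    AsyncStream { continuation in
        let state = LatestPair<A, B>(output: continuation)
        let task = Task {
            // One child task per upstream, as described above.
            await withTaskGroup(of: Void.self) { group in
                group.addTask {
                    for await value in first { await state.updateFirst(value) }
                }
                group.addTask {
                    for await value in second { await state.updateSecond(value) }
                }
            }
            // Both upstreams have finished (or the task was cancelled).
            continuation.finish()
        }
        // Stop consuming the upstreams when the downstream consumer goes away.
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

Even in this stripped-down form, every upstream value involves a task wake-up plus an actor hop before anything reaches the consumer, which is the overhead being questioned above.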
So the answer to my question if I'd move from Combine to Swift concurrency is: NO and YES.
I have been pretty intimidated by comments and posts claiming that using Combine and Concurrency together is dodgy - but I have discovered that, at least for me, they are actually a match made in heaven!!
What I have found to be working for me very well so far is:
I use Combine for the reactive parts of my code. In my current case, all the update events I get from the network are pushed into publisher chains, which massage, recombine and filter them into publishers that have meaning to my business logic or UI. I make sure I only pass clean Sendable structs through the publishers and that they publish on the network's receive or transmit queue.
I use Concurrency (Tasks, Actors etc) for my models and processes. The fact that you can easily get a publisher as AsyncSequence through the Publisher.values property makes it really easy to implement logic in a linear way (rather than creating weird complex logic in Combine.)
I guess what I would not do is turn any swift concurrency constructs into Combine publishers.
In short Combine provides the easy recombination and timing of reactive code and Concurrency lets me use these publishers and create linear code with it.
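For readers who have not used it yet, consuming a publisher chain linearly through `values` can be sketched as follows. The publisher here is a trivial array publisher chosen only to keep the example deterministic, and `squaredValues` is a made-up function name; the guard is there because Combine only exists on Apple platforms.

```swift
#if canImport(Combine)
import Combine

/// Consumes a publisher chain with a plain for-await loop.
func squaredValues() async -> [Int] {
    // The reactive part: an ordinary Combine chain.
    let publisher = [1, 2, 3, 4, 5].publisher.map { $0 * $0 }

    // The linear part: Publisher.values exposes the chain as an AsyncSequence.
    var result: [Int] = []
    for await value in publisher.values {
        result.append(value)
    }
    return result
}
#endif
```

One caveat worth checking in your own setup: as far as I know, `values` requests one element at a time, so a subject that does not buffer (such as PassthroughSubject) can drop values sent while the loop body is still busy. Adding a `buffer` operator upstream is one way to deal with that.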
I actually quite like it! And it has been a lot less complicated than I anticipated (so far).