OperationQueue + URLSession = no network activity OOB

beefon · January 30, 2020, 10:26am

I wrote a small CLI tool to perform quite a few network requests to my local server. It just does not work.

I started an investigation and crafted this snippet (good for playground):

https://gist.github.com/beefon/0d5dbd6bffc0775bfd0fbed0785daecd

operationqueue_with_urlsession_no_activity.swift

import Dispatch
import Foundation

// If you run this playground, you will never see `Request feedback` message in console.
let queue = OperationQueue()

// but if you uncomment the line below, it suddenly starts working great
// queue.maxConcurrentOperationCount = 63
// but if you set maxConcurrentOperationCount to 64, it stops working again

This file has been truncated.

When you schedule a bunch of data tasks on an operation queue, they never run unless you explicitly set maxConcurrentOperationCount to a value below 64.

I've spent 2 days tracking down this not very obvious reason for requests not being sent. Is it an internal bug in the networking library, or did I make a mistake in my sample code?

beefon · January 30, 2020, 12:09pm

Also, everything works okay if you schedule 63 operations to the queue with default maxConcurrentOperationCount. If you schedule 64, it blocks as well.

Reproducible on Xcode 11.3, OS X 10.14.6 / 10.15.2. Not sure about iOS though.

suyashsrijan · January 31, 2020, 12:58pm

It might be because GCD's max thread pool size is 64.

eskimo · February 2, 2020, 10:22pm

It might be because GCD's max thread pool size is 64.

Most likely, yes.

I suspect what’s actually causing the hang is that the operation queue is tying up all of Dispatch’s worker threads, but the operation queue needs Dispatch for its own synchronisation needs, and this work never makes progress because there are no worker threads left to service those requests.

beefon, you’re combining OperationQueue and URLSession in a way that runs counter to their design. URLSession is designed to run asynchronously. You are using a Dispatch group to run it pseudo-synchronously. That’s generally not good. It’s especially not good if you’re trying to run lots of requests in parallel. Every request consumes a thread waiting for the Dispatch group to complete, and threads are expensive.

The correct way to integrate OperationQueue and URLSession is via an async operation (previously known as a concurrent operation). That is, subclass Operation, have it override isAsynchronous to indicate that it’s asynchronous, and then implement the start method to kick off the async work. See the Operation docs for details.

This is kinda tricky to get right, so you might want to look at how others have done it.
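For readers who want to see the shape of the async operation pattern described above, here is a minimal sketch. The class names `AsyncOperation` and `DataTaskOperation` and the localhost URL are hypothetical, chosen only for illustration; this is not the code from the original gist, and it deliberately leaves out cancellation of an in-flight task.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // needed for URLSession on Linux
#endif

/// Hypothetical base class (not part of Foundation) for operations whose
/// work completes later, on some other thread.
class AsyncOperation: Operation {
    private let stateLock = NSLock()
    private var _executing = false
    private var _finished = false

    override var isAsynchronous: Bool { return true }

    override var isExecuting: Bool {
        stateLock.lock(); defer { stateLock.unlock() }
        return _executing
    }

    override var isFinished: Bool {
        stateLock.lock(); defer { stateLock.unlock() }
        return _finished
    }

    override func start() {
        // An operation cancelled before it starts must still report finished.
        guard !isCancelled else {
            finish()
            return
        }
        willChangeValue(forKey: "isExecuting")
        stateLock.lock(); _executing = true; stateLock.unlock()
        didChangeValue(forKey: "isExecuting")
        main()  // Subclasses kick off their async work here and return at once.
    }

    /// Subclasses must call this exactly once when the async work is done;
    /// otherwise the queue never considers the operation finished.
    func finish() {
        willChangeValue(forKey: "isExecuting")
        willChangeValue(forKey: "isFinished")
        stateLock.lock()
        _executing = false
        _finished = true
        stateLock.unlock()
        didChangeValue(forKey: "isFinished")
        didChangeValue(forKey: "isExecuting")
    }
}

/// Hypothetical subclass wrapping a single data task. No thread is blocked
/// while the request is in flight.
final class DataTaskOperation: AsyncOperation {
    private let request: URLRequest
    private let session: URLSession
    private(set) var result: Result<Data, Error>?

    init(request: URLRequest, session: URLSession = .shared) {
        self.request = request
        self.session = session
    }

    override func main() {
        session.dataTask(with: request) { data, _, error in
            if let error = error {
                self.result = .failure(error)
            } else {
                self.result = .success(data ?? Data())
            }
            self.finish()
        }.resume()
    }
}

/// Usage sketch; the URL is an assumption standing in for the local server.
func enqueueExampleRequests(on queue: OperationQueue) {
    let url = URL(string: "http://localhost:8080/")!
    for _ in 0..<1000 {
        queue.addOperation(DataTaskOperation(request: URLRequest(url: url)))
    }
}
```

Because `main()` returns as soon as the task is resumed, each operation holds a worker thread only briefly, so the number of operations in the queue no longer translates into the same number of blocked threads.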

Regardless, this is more about Foundation than about Swift — everything here would apply equally if you were using NSOperation from Objective-C — so if you have follow-up questions you should pop on over to the Core OS > Concurrency topic area on DevForums and we can chat there.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Jon_Shier · February 2, 2020, 10:32pm

In addition to what @eskimo said, URLSession also internally limits its parallelism, so attempting to perform 64 requests at once doesn't really do anything. I believe the documented limits are 4 requests on iOS and 6 on macOS, but that documentation is many years old, so those limits may have changed. But in general, it doesn't really help to enqueue a lot of requests at once. In fact, if you do, those requests may time out while they're waiting to run. It's best to batch requests, only resuming new requests once one has completed.
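One possible way to do the batching suggested above is a small limiter that keeps a fixed number of jobs in flight and starts the next one from the completion callback. This is only a sketch: `InFlightLimiter` and `fetch(_:using:)` are hypothetical names, not Foundation API, and any limit you pass in is an arbitrary choice rather than a recommendation.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // needed for URLSession on Linux
#endif

/// Hypothetical helper: runs at most `limit` jobs at a time. Each job gets a
/// `done` callback and must call it exactly once when its work completes.
final class InFlightLimiter {
    typealias Job = (@escaping () -> Void) -> Void

    private let limit: Int
    private let lock = NSLock()
    private var pending: [Job] = []
    private var running = 0

    init(limit: Int) {
        precondition(limit > 0, "limit must be positive")
        self.limit = limit
    }

    func submit(_ job: @escaping Job) {
        lock.lock()
        pending.append(job)
        lock.unlock()
        startNextIfPossible()
    }

    private func startNextIfPossible() {
        lock.lock()
        guard running < limit, !pending.isEmpty else {
            lock.unlock()
            return
        }
        running += 1
        let job = pending.removeFirst()
        lock.unlock()

        // The lock is released before the job runs, so a job may call
        // `done` synchronously without deadlocking.
        job {
            self.lock.lock()
            self.running -= 1
            self.lock.unlock()
            self.startNextIfPossible()
        }
    }
}

/// Usage sketch: a new data task is resumed only when a slot frees up.
func fetch(_ url: URL, using limiter: InFlightLimiter) {
    limiter.submit { done in
        URLSession.shared.dataTask(with: url) { _, _, _ in
            // Handle data, response and error here.
            done()
        }.resume()
    }
}
```

Since a task is resumed only when a slot is free, requests do not sit inside the session waiting to run, which is where the timeout risk mentioned above comes from.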

beefon · February 3, 2020, 8:02am

Thanks eskimo. I'm aware of the async Operation subclassing pattern, but this is a small tool for my needs and I just implemented it in the shortest path possible way.

I didn't expect GCD to have a thread limit for all queues. I thought it would create independent threads for different queues internally.

Jon_Shier, thanks for sharing a bit of URLSession implementation details. 6 parallel requests sounds like a really small number to me though, especially if I have a gigabit ethernet and I can't utilize it using 6 parallel requests. Anyways, the details in this thread explain why session does not perform requests as expected.

eskimo · February 3, 2020, 8:55am

I just implemented it in the shortest path possible way.

The “shortest possible way” requires no operation queue at all:

let group = DispatchGroup()
for _ in 0...999 {
    group.enter()
    URLSession.shared.dataTask(with: …) { _, _, _ in
        …
        group.leave()
    }.resume()
}
group.wait()

This doesn’t suffer from the threading problem we discussed because the requests are queued within the session.

I didn’t expect GCD to have a thread limit for all queues.

Dispatch is, alas, not magic. There have to be some limits. Without them, it’s very easy to run into a phenomenon known as thread explosion (in fact that’s still possible on the Mac, where Dispatch supports the notion of over commit). If you want to learn more about this, watch WWDC 2015 Session 718 Building Responsive and Efficient Apps with GCD. If you want an outline of current Dispatch best practices, watch WWDC 2017 Session 706 Modernizing Grand Central Dispatch Usage.

6 parallel requests sounds like a really small number to me though,
especially if I have a gigabit ethernet and I can't utilize it using 6
parallel requests.

The number Jon_Shier’s referring to is the number of parallel connections, not parallel requests. With the introduction of HTTP/2, these are no longer connected.

Also, this connection limit is enforced per host. Hence the property name, httpMaximumConnectionsPerHost.
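For reference, the per-host connection limit is set on the session configuration before the session is created. A minimal sketch, where the value 8 is an arbitrary illustration and not a recommendation:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // needed for URLSession on Linux
#endif

// Start from the default configuration and change only the per-host limit.
let configuration = URLSessionConfiguration.default
configuration.httpMaximumConnectionsPerHost = 8  // arbitrary example value

// The session copies the configuration, so later changes to `configuration`
// do not affect this session.
let session = URLSession(configuration: configuration)
```

URLSession.shared cannot be configured this way, so a custom limit means creating your own session.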

Finally, it’s hard to offer any concrete advice about your performance concern without knowing more about your request pattern. I do have a couple of general points:

Having multiple parallel requests in flight generally only helps with latency issues, not bandwidth. The way to get the most bandwidth out of a link is to stream a single large request.
Optimising performance on gigabit Ethernet is a non-trivial problem and you’re unlikely to get where you want to be via the “shortest possible way”.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

beefon · February 3, 2020, 9:18am

Thanks Quinn for the links and the deeper explanation. The APIs are huge, and even though httpMaximumConnectionsPerHost holds that magic limit of 4 to 6 connections, it is easy to overlook, wouldn't you agree? This is actually helpful for me, because I perform requests to the same host.

Using DispatchGroup is indeed an even shorter way to run session tasks. But it relies entirely on the fact that URLSession is an async API that takes care of the threading itself, which is a controversial way of solving the problem to me; it feels more like abusing than actually using the API.

My example is an extract from my code, crafted to start this investigation and to show that something process-wide happens around an operation count of 64. I thought I could upload and process something in parallel on the same (or different) queues; that's how I came up with this code snippet.

You can see there are a lot of ways to perform the same task: GCD groups, async Operations, synchronous requests from a concurrent queue, and maybe something else can be invented. I chose one way and hit an obstacle, and it wasn't very obvious what was going on. The purpose was to explore and understand what is happening. This thread is not about "here is the code, fix it for me": I've found an ugly workaround to limit queue concurrency anyways.