Data(contentsOf:) fails on Linux where it doesn't on macOS

I'm porting one of my projects to Linux and I've run into an annoying issue relating to the Data(contentsOf:) initialiser. In this particular case I'm using that initialiser to load json files from the internet, and it works perfectly on macOS. However, on Linux I get the following error (for the exact same files): Error Domain=NSCocoaErrorDomain code=256 "(null)". Ignoring the fact that that's an absolutely terrible error for a stdlib to be throwing, that initialiser is not one that I would've expected to have discrepancies between platforms. I have seen people saying not to use that initialiser and to use URLSession instead, but I don't see why it should fail, and URLSession is pretty cumbersome to just whip out, especially in synchronous code.

To reproduce the issue on Ubuntu 22.04 you can run the following code:

import Foundation
import FoundationNetworking

// The initialiser throws, so it has to be called with `try`.
let data = try Data(contentsOf: URL(string: "https://gitlab.bixilon.de/bixilon/pixlyzer-data/-/raw/c623c21be12aa1f9be3f36f0e32fbc61f8f16bd1/version/1.16.1/items.min.json")!)

I tried rewriting the code to use URLSession and now I get a new error (probably the underlying cause for the initial cryptic error). The new error is NSURLErrorDomain error -1001 and the code is below. I have dodgily bridged it into synchronous code because the codebase is big and I'd rather not propagate async/await everywhere until I've figured out a working solution. Googling around a bit I found people saying that it could be a timeout related issue, however the file is around 300kb and loads in about half a second on my network (my browser loads it fine and so does macOS).

let box: Box<Result<Data, Error>?> = Box(nil)
let semaphore = DispatchSemaphore(value: 0)
let task = URLSession.shared.dataTask(with: url) { data, _, error in
  if let data = data {
    box.value = .success(data)
  } else if let error = error {
    box.value = .failure(error)
  }
  semaphore.signal()
}
task.resume()

semaphore.wait()

if let result = box.value {
  return try result.get()
} else {
  throw RequestError.unknownError
}

Oddly enough, this new code has worked once out of the 15 or so times that I've run it, so it seems to be a stability related issue.

I'm posting this to see if anyone else has any ideas or experience with this error. Should I be opening an issue on the swift-corelibs-foundation repo for this as well?

I assumed it only works on "file" urls, and even if that's not enforced I'd trust documentation:

You can also use it to read short files synchronously.

Also consider using this async wrapper:

let (data, _) = try await URLSession.shared.data(from: url)

Works for me, Ubuntu 22.04.1 aarch64, Swift 5.7 release. Both from the repl and a binary.

Which version of Swift are you using?

Also, -1001 is indeed a timeout. Perhaps it has something to do with how your mac/linux machines are connected to the network?
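For reference, a small sketch of where -1001 comes from: it is the raw value of URLError's timedOut code, which Foundation also exposes as the NSURLErrorTimedOut constant. Nothing here touches the network; the error value is built by hand purely to inspect the code.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // Linux: the URL loading types live here
#endif

// Build the error value by hand; no request is made.
let timedOut = URLError(.timedOut)

print(timedOut.code.rawValue)                        // -1001
print(timedOut.code.rawValue == NSURLErrorTimedOut)  // true
```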

I did check the source code and it has one branch for file urls and one branch for network urls. And that recommendation seems to just be there to stop people blocking threads, which doesn't really matter in my situation.

That async version of the function is sadly only available on Monterey and I need to support Big Sur. But I did try that one on Linux and it also ran into the same issues, so it seems to be a bigger issue in Foundation's network request handling than just the Data(contentsOf:) initialiser (which imo should still work).

I'm on Ubuntu 22.04.1 x86_64 Swift 5.7.1 release. Both repl and binary are broken for me.

That's what I was thinking too, but I've just tried out the SwiftyRequest library (which doesn't use Foundation for networking) and it works perfectly fine (although I'd prefer for that not to be the permanent solution because it relies on SwiftNIO which adds a big chunk to my build times). Also, the Data(contentsOf:) initialiser is consistently working fine in another part of my code where I download a much bigger file (around 15mb iirc). It's such a strange issue :thinking:

I see. Just to troubleshoot the issue, try using the URLRequest version of dataTask and specifying your own timeout:

dataTask(with: URLRequest(url: url, timeoutInterval: 60))

Will it block for 60 seconds and then fail with a timeout error?

Yep, it just fails after a minute

Another idea: maybe the task completes right on "resume" in some cases?

task.resume()
semaphore.wait()

In this case it's worth trying to put "wait" before "resume".

(Yeah, that was a stupid idea :slight_smile: But at least check whether resume can complete the task straight away, e.g. using print logging. If that's the problem, there are two possible fixes: either don't wait on the semaphore in that specific case, or postpone "signal" via a queue.async {} call so that wait can happen first.)
Edit: as corrected below, there is no race condition.

Putting wait before resume would hang forever because the code would never get to resume afaict

Thanks for the help, I've narrowed down the cause a bit more now, and I'll probably open an issue for it soon. Sadly I still don't really know the cause, especially given that it works fine for you @Karl, but maybe looking at the code I'll find some possible code paths that produce that error and seem likely.

Since you say the dataTask version works 1/15 times, what happens if you try the synchronous version in a loop? Like, say you try 100 times, does it ever succeed?

for _ in 0..<100 {
  if let data = try? Data(contentsOf: /*...*/) {
    print("Worked!")
    break
  }
}

Also, since Foundation uses libcurl underneath, can curl (on the command line) fetch that resource?

It might? But each iteration would take 30 seconds or more (the initialiser does seem to be timing out for whatever reason), so it wouldn't be practical. I'm not at my laptop anymore so I can't test it, but it would also probably take a very long time to test :sweat_smile: I might try running the URLSession method in a for loop instead and setting a timeout of 5 seconds or so and seeing if it ever succeeds, because when it succeeds it takes less than a second.
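A minimal sketch of that experiment, assuming the same semaphore bridging as the earlier snippet. ResultBox stands in for the Box type (its definition isn't shown in the thread), and fetchOnce, firstSuccess and probe are hypothetical names made up for illustration. The 5 second timeout and the attempt count are arbitrary.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // Linux: URLSession and URLRequest live here
#endif

// Stand-in for the `Box` type from the earlier snippet (assumed shape).
final class ResultBox {
    var value: Result<Data, Error>?
}

/// Hypothetical helper: one blocking request with a per-request timeout.
/// Returns nil if the completion handler reported neither data nor an error.
func fetchOnce(_ url: URL, timeout: TimeInterval) -> Result<Data, Error>? {
    let box = ResultBox()
    let semaphore = DispatchSemaphore(value: 0)
    let request = URLRequest(url: url, timeoutInterval: timeout)
    let task = URLSession.shared.dataTask(with: request) { data, _, error in
        if let data = data {
            box.value = .success(data)
        } else if let error = error {
            box.value = .failure(error)
        }
        semaphore.signal()
    }
    task.resume()
    semaphore.wait()
    return box.value
}

/// Hypothetical helper: runs `attempt` up to `maxAttempts` times and returns
/// the 1-based number of the first attempt that succeeded, or nil if none did.
func firstSuccess(maxAttempts: Int, attempt: () -> Bool) -> Int? {
    for number in stride(from: 1, through: maxAttempts, by: 1) {
        if attempt() { return number }
    }
    return nil
}

/// Not called automatically; call it with the URL that keeps timing out.
func probe(_ url: URL) {
    let winner = firstSuccess(maxAttempts: 20) {
        if case .some(.success) = fetchOnce(url, timeout: 5) { return true }
        return false
    }
    print(winner.map { "Succeeded on attempt \($0)" } ?? "All attempts failed")
}
```

With a 5 second timeout, 20 failed attempts take under two minutes, compared with 30 seconds or more per attempt with the default timeout. The retry loop is kept separate from the request so it can be exercised without a network connection.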

Yeah, curl can download it successfully

There's definitely a race condition here (*). To make it more obvious, try putting "usleep(100_000)" between these two calls; I bet in this case it will always fail to work.

However there might be something else: as written, this window is very small so probably this alone doesn't explain the issue you are experiencing 15 times out of 16.

(*) There are several ways to avoid this race condition; one would be specifying your own URLSession with a custom serial queue and calling "task.resume" on that queue.

Edit: as corrected below, there is no race condition.

Also, the Data(contentsOf:) initialiser is consistently working fine in another part of my code where I download a much bigger file

Is that from the same server, or a different one? I have only read what's on this thread and not tested anything, but it smells a little like DNS problems could be playing a role.

I don't believe it's a race condition. There would be so little time between those two calls, yet the file takes about 400ms to download. The boxed result is also never nil; it always gets set by the completion handler before the function returns.

It’s a different server, tonight I’ll try seeing if it happens for every endpoint on that server not just the particular one that’s giving me issues.

Found this old thread that also talks about URLSession timing out on linux.

I agree that this is not what's happening in this case, as the race condition I am talking about would have led to the app blocking indefinitely on the semaphore wait, which is not what you are observing.

Edit: as corrected below, there is no race condition.

Presuming there's a bug in the URLSession implementation, would it help to create a new session instead of using the shared one?

let session = URLSession(configuration: .default)
let task = session.dataTask(with: url) { data, _, error in
...

That results in the same error sadly.

I thought about the race condition some more, and I don't think there is one: if the data task finishes before semaphore.wait(), the semaphore's value will already be 1, so wait will just set it back to 0 and continue execution basically immediately.
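That counting behaviour can be checked in isolation with a small sketch that involves no networking at all. The signal is sent first to simulate a completion handler that finishes before the caller reaches wait; the one second timeout is only a safety net so the demo cannot hang.

```swift
import Foundation

let semaphore = DispatchSemaphore(value: 0)

// Simulate the completion handler finishing before the caller reaches wait().
semaphore.signal()

// wait() consumes the earlier signal and returns right away instead of blocking.
let outcome = semaphore.wait(timeout: .now() + 1)
print(outcome == .success) // true
```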