Hello everyone,
I've been struggling with a persistent issue for several weeks and would greatly appreciate any insights or suggestions from the community.
Problem Summary
We are sending JSON requests (~100 KB in size) via URLSession from a Swift app running on Windows. These requests consistently time out after a while. Specifically, we receive the following error:
Error Domain=NSURLErrorDomain Code=-1001 "(null)"
This only occurs on Windows – under macOS and Linux, the same requests work perfectly.
Details
- The server responds in under 5 seconds, and we have verified that the backend (a Vapor app in Kubernetes) is definitely not the bottleneck.
- The request always runs into the timeout, no matter how high we set it (timeoutIntervalForRequest): 60, 120, 300, or 600 seconds. The error remains the same.
- The request flow: Swift App (Windows) ---> HTTPS ---> Load Balancer (NGINX) ---> HTTP ---> Ingress Controller ---> Vapor App (Kubernetes)
- On the load balancer we see this error:
client prematurely closed connection, so upstream connection is closed too (104: Connection reset by peer)
- The Ingress Controller never receives the complete body in these error cases. The content length set by the Swift app exceeds the data actually received.
- We disabled request buffering in the Ingress Controller, but the issue persists.
- We even tested a setup where we inserted a Caddy server in between to strip away TLS. The Swift app sent unencrypted HTTP requests to Caddy, which then forwarded them. This slightly improved stability but did not solve the issue.
Additional Notes
- The URLSession is configured in an actor, with a nonisolated URLSession instance:
    actor DataConnectActor {
        nonisolated let session: URLSession = URLSession(configuration: {
            let urlSessionConfiguration: URLSessionConfiguration = URLSessionConfiguration.default
            urlSessionConfiguration.httpMaximumConnectionsPerHost = ProcessInfo.processInfo.environment["DATACONNECT_MAX_CONNECTIONS"]?.asInt() ?? 16
            urlSessionConfiguration.timeoutIntervalForRequest = TimeInterval(ProcessInfo.processInfo.environment["DATACONNECT_REQUEST_TIMEOUT"]?.asInt() ?? 120)
            urlSessionConfiguration.timeoutIntervalForResource = TimeInterval(ProcessInfo.processInfo.environment["DATACONNECT_RESSOURCE_TIMEOUT"]?.asInt() ?? 300)
            urlSessionConfiguration.httpAdditionalHeaders = ["User-Agent": "DataConnect Agent (\(Environment.version))"]
            return urlSessionConfiguration
        }())

        public internal(set) var accessToken: UUID? = nil
        ...
    }
- Requests are sent via a TaskGroup, limited to 5 concurrent tasks.
- The more concurrent tasks we allow, the faster the timeout occurs.
- We already increased the number of ephemeral ports in Windows. This made things slightly better, but the problem remains.
- Using URLSessionDebugLibcurl=1 doesn't reveal any obvious issue related to libcurl.
- We have also implemented a retry mechanism, but all retries also time out.
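
A TaskGroup with a concurrency limit, as described in the notes above, can be sketched roughly as follows. This is a minimal illustration and not the app's actual code: `mapWithLimit`, its parameters, and the generic `operation` closure are hypothetical names, and in the real app the closure would wrap the `urlSession.data(for:)` call.

```swift
// Hypothetical helper: runs `operation` over `items` with at most
// `limit` child tasks in flight, and returns results in input order.
func mapWithLimit<Input: Sendable, Output: Sendable>(
    _ items: [Input],
    limit: Int,
    operation: @escaping @Sendable (Input) async throws -> Output
) async throws -> [Output] {
    precondition(limit > 0, "limit must be positive")
    return try await withThrowingTaskGroup(of: (Int, Output).self) { group in
        var results = [Output?](repeating: nil, count: items.count)
        var nextIndex = 0

        // Start the first `limit` tasks (or fewer if there are not enough items).
        while nextIndex < min(limit, items.count) {
            let index = nextIndex
            let item = items[index]
            group.addTask { (index, try await operation(item)) }
            nextIndex += 1
        }

        // Whenever one task finishes, store its result and start the next item,
        // so the number of running tasks never exceeds `limit`.
        while let (finishedIndex, output) = try await group.next() {
            results[finishedIndex] = output
            if nextIndex < items.count {
                let index = nextIndex
                let item = items[index]
                group.addTask { (index, try await operation(item)) }
                nextIndex += 1
            }
        }
        return results.compactMap { $0 }
    }
}
```

With a limit of 5, a call would look like `try await mapWithLimit(batches, limit: 5) { batch in try await send(batch) }`, where `batches` and `send` are placeholders for the app's own data and upload function. If any task throws, the error propagates and the remaining child tasks are cancelled.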
Request Flow (Code Snippet Summary)
    let data = try JSONEncoder().encode(entries)
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.httpBody = data
    request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json; charset=UTF-8", forHTTPHeaderField: "Content-Type")
    // additional headers...
    let (responseData, response) = try await urlSession.data(for: request)
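
Since the Ingress Controller reports fewer bytes than the announced content length, it may help to record the exact body size on the client for each request and compare it with the server-side logs. The following is a minimal sketch of the request construction above wrapped in a function that also returns that size. `makeJSONPost` and its parameter names are hypothetical and not part of the app.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives in this module on Windows and Linux
#endif

// Hypothetical helper: builds the JSON POST request and returns the number
// of body bytes alongside it, so the value can be logged per request.
func makeJSONPost<Payload: Encodable>(
    _ payload: Payload,
    to url: URL,
    token: String
) throws -> (request: URLRequest, bodyBytes: Int) {
    let data = try JSONEncoder().encode(payload)
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.httpBody = data
    request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json; charset=UTF-8", forHTTPHeaderField: "Content-Type")
    return (request, data.count)
}
```

Logging `bodyBytes` next to a request identifier makes it possible to see how much of each body was missing when the load balancer reported the reset.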
What We’ve Tried
- Tested with and without TLS
- Increased timeout and connection settings
- Disabled buffering on Ingress
- Increased ephemeral ports on Windows
- Limited concurrent requests
- Used URLSessionDebugLibcurl=1
We have run out of ideas for where to look next.
Thank you in advance for any guidance!