Hi,
what is the best way to increase the size of the stack?
The problem: according to Apple's documentation [1], the stack size of the macOS main thread is 8 MB and that of a secondary thread is 512 KB (on iOS the main thread has a stack size of 1 MB). This sounds reasonable for the memory available more than 10 years ago, but with more complex algorithms (especially heavily recursive ones) and Swift on the server, it would be good to be able to increase this limit.
But I only see a way to change the stack size when I do 'manual' thread handling with NSThread [2]. We hopefully agree that, in times of Grand Central Dispatch, handling NSThreads "manually" is not the best way to go.
Are there any other approaches?
At the moment I only have problems with secondary threads, but it should be possible to change the stack size of all threads easily. I can now buy an iMac Pro with 128 GB of memory, a MacBook with 32 GB, or an AWS instance with much more memory, yet secondary threads have a 512 KB limit. This doesn't sound right.
I get stack overflows (because the stacks are too small) on macOS and on Ubuntu.
One of the ideas of Swift was to use more structs and immutable data instead of classes etc., but structs need memory on the stack, and with such a small stack size this is a problem (or it doesn't fit the "struct" idea of Swift).
For the main thread there is usually a platform-specific way to set its size. For Apple platforms this is done via the linker’s -stack_size argument. See the man page for details.
The situation with secondary threads depends on whether you’re using Dispatch or not:
If you’re using an explicit thread API, that API is likely to include some way to configure the stack size. It seems like you’ve already found this for NSThread.
If you’re using Dispatch then there’s no way to configure the stack size of the thread that runs your closures.
With regard to this last point, you should feel free to file an enhancement request for that feature, although be aware that implementing this would be tricky. Dispatch maintains a pool of worker threads, and there would have to be some way to bind a specific queue to a specific stack size, and thus to a specific subset of worker threads, and once you dive into the details things get really complex.
As to what you can do right now, you have two options:
Avoid Dispatch and manage your own threads (A).
Work to reduce your stack usage (B).
Option A is not as bad as you might think. You wrote:
We hopefully agree, that in times of grand central dispatch handling
NSThreads "manual" is not the best way to do.
Things are more subtle than that. Dispatch is a good option for many use cases but it’s not the only option. The explicit thread APIs continue to exist because there are various situations where using an explicit thread is better than using Dispatch. This is one of those.
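For illustration, here is a minimal sketch of option A using Foundation's Thread and its stackSize property. The names runOnThread, ResultBox and depthSum are made up for this example, and the 64 MB figure is an arbitrary assumption; pick a size that fits your own recursion depth. The documentation asks for a size in bytes that is a multiple of 4 KB, set before the thread starts.

```swift
import Foundation

// Hypothetical helper type: carries the result out of the worker thread.
// Access is ordered by the semaphore below, hence @unchecked Sendable.
final class ResultBox<T>: @unchecked Sendable {
    var value: T?
}

// Hypothetical helper: runs `body` on a dedicated thread with the requested
// stack size and blocks the caller until it has finished.
func runOnThread<T>(stackSize: Int, _ body: @escaping @Sendable () -> T) -> T {
    let box = ResultBox<T>()
    let done = DispatchSemaphore(value: 0)
    let thread = Thread {
        box.value = body()
        done.signal()
    }
    thread.stackSize = stackSize  // in bytes; must be set before start()
    thread.start()
    done.wait()                   // wait for the worker to publish its result
    return box.value!
}

// Deliberately non-tail-recursive, so each level may need its own stack frame.
func depthSum(_ n: Int) -> Int {
    return n == 0 ? 0 : n + depthSum(n - 1)
}

// 100_000 levels of recursion on a 64 MB stack.
let total = runOnThread(stackSize: 64 << 20) { depthSum(100_000) }
print(total)  // 5000050000
```

Blocking the caller keeps the sketch short. In a server you would more likely hand the result back through a callback so that the calling Dispatch worker isn't tied up while the big-stack thread runs.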
With regard to option B, be aware that many standard library types are relatively ‘stack light’ because they store their underlying data in a CoW heap buffer. This includes String and Array. You may be able to make good progress adopting this technique in your own data types.
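A rough sketch of what that technique can look like for your own types follows. Boxed, Ref and Payload are hypothetical names invented for this example; the only standard library call relied on is isKnownUniquelyReferenced. The wrapper keeps value semantics, but only a single reference lives inline, while the payload itself sits on the heap.

```swift
// Stand-in for a larger model type (assumption for illustration only).
struct Payload {
    var a, b, c, d: Int64
}

// Heap storage for the wrapped value.
final class Ref<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

// Value-semantic wrapper that stores its payload in a heap box and
// copies that box lazily, on the first write to shared storage.
struct Boxed<T> {
    private var ref: Ref<T>

    init(_ value: T) { ref = Ref(value) }

    var value: T {
        get { return ref.value }
        set {
            if isKnownUniquelyReferenced(&ref) {
                ref.value = newValue   // sole owner: mutate in place
            } else {
                ref = Ref(newValue)    // shared: write into a fresh box
            }
        }
    }
}

var first = Boxed(Payload(a: 1, b: 2, c: 3, d: 4))
let second = first       // copies one reference, not the payload
first.value.a = 99       // triggers the copy; `second` is unaffected
print(first.value.a, second.value.a)
print(MemoryLayout<Boxed<Payload>>.size, MemoryLayout<Payload>.size)
```

The trade-off is an allocation per box and an extra indirection on every access, so this tends to pay off for large payloads rather than small ones.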
Share and Enjoy
—
Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware
let myEmail = "eskimo" + "1" + "@apple.com"
At the moment we are using Dispatch (OK, we are not using Dispatch ourselves, but Kitura is).
I will create an enhancement request, thanks. Perhaps this will fix the problem in the future.
For me it would be fine if I could change the stack size of all secondary threads for my application, with something like -secondary_thread_stack_size.
We tried option B, but this isn't possible. Most of our underlying data structures are structs with enums / ints, and because it is a recursive system, the call stack can be thousands of functions deep. I know this isn't a typical problem, but our problems are structured this way.
So it sounds like we have to use option A. This is manageable but "not nice" (because of the added complexity ^^). My biggest concern is that we have no way to change the threading behaviour of Kitura, so Kitura will create new threads because it uses Dispatch, and in addition we will create new threads with a larger stack size. This is not ideal, but it should work anyway.
Recursively stack allocating a lot of temporary data is not exactly what that talk was advocating, nor is it good practice in any case. Regular recursion is fine when the recursion depth is shallow (unless your temporaries are very, very large), but if it's not then you should either use tail-recursion, a loop, or map/reduce/forEach/etc.
Swift can usually optimize tail-recursion into just a jump, so it shouldn't thrash your stack as much. Just be careful about passing parameters that participate in ARC, like most collections or indirect enums. If you're still using Swift <= 4.1, then they'll be passed with a +1 refcount and the optimizer might not be able to optimize out all of the retains and releases, which would prevent the tail-recursion optimization.
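To make the shape concrete, here is a small sketch of the accumulator style that puts the recursive call in tail position, next to the equivalent loop. The function names are made up for this example. Swift does not guarantee that the tail call becomes a jump (unoptimized builds generally won't do it), so the loop is the form that is safe at any depth.

```swift
// Tail-recursive form: the recursive call is the last thing the function
// does, and only trivially copyable Ints are passed, so no ARC traffic
// sits between the call and the return.
func sumTo(_ n: Int, accumulated: Int = 0) -> Int {
    if n == 0 { return accumulated }
    return sumTo(n - 1, accumulated: accumulated + n)
}

// Equivalent loop: constant stack usage regardless of optimization level.
func sumToLoop(_ n: Int) -> Int {
    var total = 0
    var remaining = n
    while remaining > 0 {
        total += remaining
        remaining -= 1
    }
    return total
}

print(sumTo(1_000))          // 500500
print(sumToLoop(1_000_000))  // 500000500000
```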
You are right, it is not good practice to put too much on the stack, but depending on the problem this can happen. And yes, tail recursion reduces the depth of recursion, but it doesn't solve all the problems, and for some algorithms (like parser combinators) it isn't easy to implement.
We use Swift 4.1, so this shouldn't be our problem. But do you have more information about tail-recursion optimisation in Swift? I only know the article [1], which says "tail call optimization (and thus tail recursion) is too fragile. Don’t rely on it." (but the article is some years old).
I don't think "I should increase the stack size" is an appropriate response to "my recursive program is running out of stack space". You would always be one over-sized problem set away from a stack overflow error.
Just avoid recursion, especially non-tail recursion.
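For the cases that aren't naturally tail-recursive, one common approach is to keep the pending work in an explicit stack held in an Array, which lives on the heap and grows as needed. The following is only a sketch under that assumption; the Nested type and sumIteratively are invented for this example.

```swift
// A nested structure of arbitrary depth. The Array supplies the
// indirection, so the enum does not need to be marked `indirect`.
enum Nested {
    case leaf(Int)
    case list([Nested])
}

// Sums all leaves without recursion: pending nodes are kept in an Array
// that serves as an explicit, heap-allocated stack.
func sumIteratively(_ root: Nested) -> Int {
    var pending = [root]
    var total = 0
    while let next = pending.popLast() {
        switch next {
        case .leaf(let value):
            total += value
        case .list(let items):
            pending.append(contentsOf: items)
        }
    }
    return total
}

let tree = Nested.list([.leaf(1), .list([.leaf(2), .leaf(3)]), .leaf(4)])
print(sumIteratively(tree))  // 10
```

The depth that can be handled is then bounded by available heap memory rather than by the thread's stack size.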
Basically, unless there are retain and release calls that the compiler can't remove, Swift will try to optimize tail-calls. It can even optimize mutually recursive tail calls. There might be some weird edge cases that it can't optimize, but they're few and far between.
The article you posted was not only four years old, it was also about the Objective-C compiler, which optimizes differently than Swift (if only because Swift does more high-level optimizations before it generates LLVM IR code). I don't know if Clang still has issues with ARC suppressing tail-call optimization, but anything that's pure C should tail-call optimize just fine.
You are right, "increasing the stack size" shouldn't be the answer to all problems. But time goes on: today we don't write assembly, we write Swift, and since ARC we don't write retain/release manually... If I remember correctly, the stack size of a thread in Mac OS X 10.0 was 512 KB, and that was 17 years ago.
"Just avoid recursion": recursion is not a bad concept, especially when you are into functional programming. Yes, you shouldn't solve all your problems with recursion, but some problems are good to solve with recursion.
There is a difference between "are good to solve with recursion" (because recursion fits the problem very well) and "are good to solve with recursion" (because they need less stack space).
You could solve quick sort with recursion, and for small arrays that’s fine. But larger arrays will require more recursion. There will then always be a point where you are limited by the recursion memory even if you have plenty of memory on your machine to allocate. At this point your only option is to rework the function, even though that does not provide any additional or significant benefits to speed and memory usage.
Yes, and to be honest I don't understand why everybody here is against it. These memory limits are perhaps(!) useful for macOS and iOS apps, but for server applications, where most of the code runs on secondary threads, this feels like an issue.
Sure, I can create my own (NS)Thread model, but then why do we have all the nice frameworks?
I ran into the stack issue too, and even migrating everything onto the main thread didn't solve it. We are using structs for network models, and the root structs are pretty big. On top of that we have some complex RxSwift chains. What I observed is that the app started to crash randomly because of stack overflows. We don't have any recursive algorithms, but the crashed stack trace has ~260 RxSwift calls. There is of course the possibility of putting async dispatches everywhere, but this looks like a workaround that doesn't really solve the core issue. Instead, I had to rework the biggest models into classes. That's sad, because I started to use structs following WWDC session advice.
Here's a tiny example program that illustrates the severity of the stack size limitation:
import Dispatch
struct TenOf<T> { let t1, t2, t3, t4, t5, t6, t7, t8, t9, t10: T }
typealias LakhOf<T> = TenOf<TenOf<TenOf<TenOf<TenOf<T>>>>> // 100K
let onmain: LakhOf<Int64>? = nil // works on main thread…
print("size of lakh:", MemoryLayout.size(ofValue: onmain)) // 800_001
DispatchQueue.concurrentPerform(iterations: 2) { i in
let onq: LakhOf<Int64>? = nil // …but crashes on dispatch queue!
print("size of lakh (on queue):", MemoryLayout.size(ofValue: onq))
}
The proposed work-around of using one's own Thread instead of GCD isn't viable in many cases, because you may be working with built-in frameworks that use GCD, such as Foundation networking. Deserializing JSON into potentially large structs from a network callback is very common, and having to fork a new thread from a perfectly good GCD queue is not a viable work-around for high-performance systems.
And the other work-around of refactoring data models from value types to classes is infeasible, to say the least.
If changing a queue's stack size at runtime presents challenges (and I see why it would), why can't there be some environment variable that controls the default stack size, like Java's -Xss flag?
I don't know enough to say one way or another if a stack should be bigger, but I don't think that's a good motivating example.
An 800 KB struct is quite large, and it would really benefit from being a class, passed around cheaply by reference, particularly on mobile platforms like watchOS and iOS.