Why is sort()'s use of insertion sort up to 20 items?

Paul Hudson says in his book Swift Coding Challenges “insertion sort is much like bubble sort: it’s generally considered a bad choice for a real-world search algorithm, but it’s easy enough to teach. However, insertion sort is always faster than bubble sort, and in fact it’s the preferred choice of algorithm if you’re sorting small data sets – Swift’s own sort() call uses it if your collection contains fewer than 20 items.”

Why is the cutoff at 20 items specifically? What sorting algorithm(s) are used by this API for larger data sets?

Since Swift 5, the implementation of sort has been Timsort, a hybrid stable sorting algorithm. This is an implementation detail that can change at any time. For details on when insertion sort is used as part of Timsort, see the linked article and the references cited there.


To add to @xwu's comment, while the 20-element cutoff for insertion sort may have been true at one time (the number sounds vaguely familiar to me), this isn't what the current sorting implementation does. Specifically, the implementation follows what the linked Wikipedia article says about minimum run sizes (i.e., the minimum merge-sort subarray sizes where insertion sort is used):

Minrun is chosen from the range 32 to 64 inclusive, such that the size of the data, divided by minrun , is equal to, or slightly less than, a power of two. The final algorithm takes the six most significant bits of the size of the array, adds one if any of the remaining bits are set, and uses that result as the minrun . This algorithm works for all arrays, including those smaller than 64; for arrays of size 63 or less, this sets minrun equal to the array size and Timsort reduces to an insertion sort.[11]

Similarly, in Sort.swift:

/// "... pick a minrun in range(32, 65) such that N/minrun is exactly a power
/// of 2, or if that isn't possible, is close to, but strictly less than, a
/// power of 2. This is easier to do than it may sound: take the first 6 bits
/// of N, and add 1 if any of the remaining bits are set."
/// - From the Timsort introduction, at
/// https://svn.python.org/projects/python/trunk/Objects/listsort.txt

To echo what @xwu said: this is all implementation detail which is subject to change at any point (so long as the publicly-documented API requirements are met).

Edit: ah, yes, you can see in the PR that migrated away from Introsort — the old Introsort code hard-coded a 20-element bound before switching over to insertion sort. (For similar reasons as Timsort) The bound there may have been empirically chosen as a good-enough tradeoff between the two algorithms.


Found another sort that I’d share when educating myself on timsort:

At least by benchmarks and time/space complexity seems interesting (beating timsort).