Compiler failing to optimize consuming sort

here’s an efficient sort, that ought to avoid reallocation if the argument was uniquely referenced.

@frozen public
struct Main
{
    public
    static func sort1(items:consuming [Int]) -> [Int]
    {
        items.sort()
        return items
    }

here’s the exact same operation, written functionally.


    public
    static func sort2(items:consuming [Int]) -> [Int]
    {
        items.sorted()
    }
}

probably, you would rather write the second one, and expect the compiler to optimize it into the first one. but that’s not what seems to be taking place.

static output.Main.sort1(items: __owned [Swift.Int]) -> [Swift.Int]:
        push    r12
        sub     rsp, 16
        mov     qword ptr [rsp], rdi
        mov     rdi, rsp
        xor     r12d, r12d
        call    (merged function signature specialization <Arg[0] = [Closure Propagated : generic not re-abstracted specialization <Swift.Array<Swift.Int>> of implicit closure #1 (A.Swift.Sequence.Element, A.Swift.Sequence.Element) -> Swift.Bool in (extension in Swift):Swift.MutableCollection< where A: Swift.RandomAccessCollection, A.Swift.Sequence.Element: Swift.Comparable>.sort() -> (), Argument Types : []> of generic specialization <[Swift.Int]> of (extension in Swift):Swift.MutableCollection< where A: Swift.RandomAccessCollection>.sort(by: (A.Swift.Sequence.Element, A.Swift.Sequence.Element) throws -> Swift.Bool) throws -> ())
        mov     rax, qword ptr [rsp]
        add     rsp, 16
        pop     r12
        ret

static output.Main.sort2(items: __owned [Swift.Int]) -> [Swift.Int]:
        push    r12
        push    rbx
        sub     rsp, 24
        mov     rbx, rdi
        mov     qword ptr [rsp + 8], rdi
        call    swift_retain@PLT
        lea     rdi, [rsp + 8]
        xor     r12d, r12d
        call    (merged function signature specialization <Arg[0] = [Closure Propagated : generic not re-abstracted specialization <Swift.Array<Swift.Int>> of implicit closure #1 (A.Swift.Sequence.Element, A.Swift.Sequence.Element) -> Swift.Bool in (extension in Swift):Swift.MutableCollection< where A: Swift.RandomAccessCollection, A.Swift.Sequence.Element: Swift.Comparable>.sort() -> (), Argument Types : []> of generic specialization <[Swift.Int]> of (extension in Swift):Swift.MutableCollection< where A: Swift.RandomAccessCollection>.sort(by: (A.Swift.Sequence.Element, A.Swift.Sequence.Element) throws -> Swift.Bool) throws -> ())
        test    r12, r12
        jne     .LBB2_2
        mov     rdi, rbx
        call    swift_release@PLT
        mov     rax, qword ptr [rsp + 8]
        add     rsp, 24
        pop     rbx
        pop     r12
        ret
2 Likes

The first one, items.sort(), requires a MutableCollection & RandomAccessCollection.

The second one, items.sorted(), only requires a Sequence, so an intermediate ContiguousArray is used by the implementation.

If I "manually inline" the second one, then its output is identical to the first one in Compiler Explorer.

1 Like

True, but the types are concrete in @taylorswift's examples - Array.

In any case, for any given specific use of even an existential input, the types might be known, and the potential exists for optimisation. Both through inlining (probably the easier case) and through specialisation (hypothetically the compiler could voluntarily generate two versions of sort2 and use the best applicable one in each instance).

The compiler would have to be able to prove, however, that sorted and sort produce identical results. Which means seeing into their implementations and that of the input type (Array in these examples).

To my knowledge the compiler does nothing like this (looking for equivalent, independent functions) - all its optimisations are based on pre-defined, hard-coded transformations of the base case.

2 Likes

the sorted implementation is @inlinable, so any optimization that can take place by manually pasting the function body ought to be automatable. compiler bug?

it looks like the compiler already knows how to do this optimization, it is just failing to make use of @inlinable information.