Are structs really always pessimistically copied when calling funcs?


(Karl Pickett) #1

I have a struct and this code:

func test() {
    precondition(sizeof(Foo) == 128)

    let s = Foo()
    for _ in 0..<100_000_000 {
        doSomething(s)
    }
}

The asm (on Linux, with -O) is showing me that s is being re-initialized on
every iteration of the loop. I was hoping that thanks to Swift's strict
constness rules on structs, it wouldn't have to do this, and would just pass
the same pointer to doSomething() each time.

When I use an inout param, that is 2x as fast and doesn't re-initialize
each time. However I don't see why passing something immutably wouldn't be
as fast.

- Karl

asm from perf:

  2.71 │50:┌─→xorps %xmm0,%xmm0
  8.06 │   │  movaps %xmm0,-0x20(%rbp)
  2.71 │   │  movaps %xmm0,-0x30(%rbp)
  7.41 │   │  movaps %xmm0,-0x40(%rbp)
 10.59 │   │  movaps %xmm0,-0x50(%rbp)
 10.00 │   │  movaps %xmm0,-0x60(%rbp)
  9.53 │   │  movaps %xmm0,-0x70(%rbp)
 10.65 │   │  movaps %xmm0,-0x80(%rbp)
 11.24 │   │  movaps %xmm0,-0x90(%rbp)
 12.06 │   │  mov %r14,%rdi
  3.41 │   │→ callq _TF4main11doSomethingFVS_3FooT_
  2.82 │   │  dec %rbx
  8.82 │   └──jne 50

main.swift:

struct Vec4 {
    var a: Int64 = 0
    var b: Int64 = 0
    var c: Int64 = 0
    var d: Int64 = 0
}

struct Foo {
    var x: Vec4 = Vec4()
    var y: Vec4 = Vec4()
    var z: Vec4 = Vec4()
    var u: Vec4 = Vec4()
}

func test() {
    precondition(sizeof(Foo) == 128)

    let s = Foo()
    for _ in 0..<100_000_000 {
        doSomething(s)
    }
}

test()

lib.swift:

func doSomething(s: Foo) {
    precondition(s.x.a != 1)
}
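
For comparison, here is a minimal sketch of what the inout variant mentioned above might look like. The inout code was not posted, so the names doSomethingInout and testInout and the iterations parameter are assumptions for illustration. It is written in current Swift syntax, where MemoryLayout replaces sizeof() and inout goes before the parameter type. Whether the compiler really passes a pointer here is an implementation detail, not something the language guarantees.

```swift
struct Vec4 {
    var a: Int64 = 0
    var b: Int64 = 0
    var c: Int64 = 0
    var d: Int64 = 0
}

struct Foo {
    var x = Vec4()
    var y = Vec4()
    var z = Vec4()
    var u = Vec4()
}

// Hypothetical inout counterpart of doSomething (name is an assumption).
func doSomethingInout(_ s: inout Foo) {
    precondition(s.x.a != 1)
}

// Hypothetical test driver; returns the number of calls made.
func testInout(iterations: Int) -> Int {
    // Four Vec4 values of four Int64 fields each: 4 * 4 * 8 = 128 bytes.
    precondition(MemoryLayout<Foo>.size == 128)

    var s = Foo()   // must be `var` so it can be passed with `&`
    var calls = 0
    for _ in 0..<iterations {
        doSomethingInout(&s)
        calls += 1
    }
    return calls
}

let calls = testInout(iterations: 1_000)
```

To compare the two variants, build both with -O and look at the loop body in the disassembly, as in the perf output above.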


(Jens Alfke) #2

Huh. That’s especially weird since the semantics of inout actually call for two copies (after the called function returns, the copy of the struct that was passed to it gets copied back into the original variable.) The compiler is often able to optimize that down to the more-expected pass-by-pointer, as in your example. So why then isn’t it able to optimize the non-inout case the same way?

—Jens
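
The copy-in copy-out model described above becomes observable when the inout argument is a computed property, since there is no stored memory to point at. A small sketch, with all type and function names (Point, Tracker, bump) made up for illustration:

```swift
struct Point {
    var x = 0
}

final class Tracker {
    private var storage = Point()
    private(set) var log: [String] = []

    // A computed property has no address of its own, so passing it as
    // inout has to read it into a temporary and write the temporary back.
    var point: Point {
        get {
            log.append("get")
            return storage
        }
        set {
            log.append("set")
            storage = newValue
        }
    }
}

func bump(_ p: inout Point) {
    p.x += 1
}

let tracker = Tracker()
bump(&tracker.point)   // getter runs before the call, setter after it returns
```

For a plain stored local variable, the compiler is free to skip the temporary and pass the address directly, which matches the behavior seen in the example in this thread.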

···

On Dec 6, 2015, at 5:16 PM, Karl Pickett via swift-users <swift-users@swift.org> wrote:

When I use an inout param, that is 2x as fast and doesn't re-initialize each time. However I don't see why passing something immutably wouldn't be as fast.


(Joe Groff) #3

This definitely seems like a place where we ought to be able to peephole the extra copies away. Mind filing a bug?

-Joe

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users


(Karl Pickett) #4

I created https://bugs.swift.org/browse/SR-110 for the compiler. However,
I also think the documentation needs an issue filed. (Where should I do
that?)

The current docs say structs are always copied (the only exception being
the optimization of an inout argument that is a variable in memory). That
would make programmers worried about speed and stack usage run away
screaming, and not give Swift a try.

···

On Mon, Dec 7, 2015 at 11:53 AM, Joe Groff <jgroff@apple.com> wrote:



(Joe Groff) #5

I created https://bugs.swift.org/browse/SR-110 for the compiler. However, I also think that the documentation needs an issue filed. (Where to do that at?)

Thanks!

The current docs say structs are always copied (the only exception being inout to memory variable optimization). That would make programmers worried about speed and stack usage run away screaming, and not give swift a try.

If they weren't already chased away by 'vars are always allocated on the heap'. The docs generally discuss high-level semantic behavior rather than the real code emitted; in general, users can count on structs being copied whenever necessary to preserve value semantics between different names. How would you suggest rewording the documentation?

-Joe
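
As a small illustration of "value semantics between different names", here is a sketch that reuses the Foo and Vec4 types from the original post. The variable names first and second are assumptions. It shows only the guaranteed semantics; when a physical copy actually happens is up to the optimizer.

```swift
struct Vec4 {
    var a: Int64 = 0
    var b: Int64 = 0
    var c: Int64 = 0
    var d: Int64 = 0
}

struct Foo {
    var x = Vec4()
    var y = Vec4()
    var z = Vec4()
    var u = Vec4()
}

let first = Foo()
var second = first    // a second name: semantically an independent value
second.x.a = 42       // mutating `second` must never be visible via `first`
```

Since doSomething(s) in the thread only reads its argument, nothing there requires a distinct copy to preserve these semantics.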

···

On Dec 7, 2015, at 10:27 AM, Karl Pickett <karl.pickett@gmail.com> wrote:



(Slava Pestov) #6

Perhaps the docs should instead talk about how the lifetime of the var's value might extend beyond the return from its scope, without explicitly saying anything about stack or heap allocation, which as you note is entirely implementation detail.
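
A short sketch of a var whose lifetime extends beyond the return from its scope, with the hypothetical names makeCounter, next, and other. The closure keeps count alive after makeCounter returns; where the storage lives is left to the implementation.

```swift
func makeCounter() -> () -> Int {
    var count = 0          // local var captured by the returned closure
    return {
        count += 1         // still valid after makeCounter has returned
        return count
    }
}

let next = makeCounter()
let firstValue = next()
let secondValue = next()

// Each call to makeCounter gets its own independent `count`.
let other = makeCounter()
let otherValue = other()
```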

···

On Dec 7, 2015, at 10:36 AM, Joe Groff via swift-users <swift-users@swift.org> wrote:



(Jens Alfke) #7

Perhaps take a look at the Go language documentation for ideas, since it has similar semantics. Here’s an explanation I found in their FAQ:

How do I know whether a variable is allocated on the heap or the stack?

From a correctness standpoint, you don't need to know. Each variable in Go exists as long as there are references to it. The storage location chosen by the implementation is irrelevant to the semantics of the language.

The storage location does have an effect on writing efficient programs. When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame. However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors. Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.

In the current compilers, if a variable has its address taken, that variable is a candidate for allocation on the heap. However, a basic escape analysis recognizes some cases when such variables will not live past the return from the function and can reside on the stack.

https://golang.org/doc/faq#stack_or_heap

—Jens



(Karl Pickett) #8

I'm fine with the docs being broad and semantic, but a 'NOTE: Copying might
be elided; see "Swift Optimizations" for details' would be helpful.
Clearly the semantics are being carefully designed to enable certain
optimizations, but if we're only seeing the semantics, we're not seeing the
whole picture.

I did a search for "heap" and found nothing in the 2.2 ebook. I recall
seeing heap->stack optimizations for closures in some video and didn't go
digging further because it seemed like the language designer(s) had a
solution. I think that's a separate issue but yes having details would be
great.

Consider a user who wants to write some C-style Swift (using stack and
heap) and wants to know "what am I being forced to pay for in Swift that C
doesn't have". The docs often contrast with C and I think that's great.

So the "Swift Optimizations" would cover heap/stack, struct copying for
func calls (pure structs, and cases where they have embedded classes), and
inouts to function or global vars. And returning a struct (does caller or
callee allocate, and/or copy). And ARC. Even if an optimization isn't
currently implemented, it could say "the language was designed for X
optimization to be possible so don't assume Y".
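
To make the "embedded classes" case above concrete, here is a sketch with made-up types (Buffer, Packet). Copying a struct that holds a class reference copies the reference, not the object, so the value fields become independent while the referenced object stays shared.

```swift
final class Buffer {
    var bytes: [UInt8] = []
}

struct Packet {
    var id: Int
    var payload: Buffer   // class reference stored inside a struct
}

let original = Packet(id: 1, payload: Buffer())
var copy = original               // copies `id` and the reference to Buffer

copy.id = 2                       // affects only `copy`
copy.payload.bytes.append(0xFF)   // mutates the one shared Buffer object
```

This is why a copy of such a struct costs more than a plain memory copy: each class reference inside it also needs reference-count bookkeeping, unless the optimizer can prove it unnecessary.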

I'm also unsure how inlining works across modules/libraries. Whole-module
optimization is great, but I assume we can't rely on that in the general
case, unless we have C++-like "header-only" libraries with full source.

Swift seems really great (the Linux build worked with no problems for me);
there's just a lack of "deep dive" material at the moment, so I'm left
hanging wondering what the gotchas are with using it for a systems project.
At the moment it's a few clues and hints to go off of.

- Karl

···

On Mon, Dec 7, 2015 at 12:36 PM, Joe Groff <jgroff@apple.com> wrote:
