problems with simple binary-trees program

This simple binary-trees program seems to cause problems for Swift on Ubuntu 15.10

http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=1

Any suggestions what the problem might be?

Winners of this test have pooled memory (apr_pools) and tail recursion.

Here's a version that's 5X faster:

-david (https://github.com/AE9RB/SwiftGL\)

···

On Sat, Dec 19, 2015 at 10:34 AM, Isaac Gouy via swift-users < swift-users@swift.org> wrote:

This simple binary-trees program seems to cause problems for Swift on
Ubuntu 15.10

http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=1

Any suggestions what the problem might be?

fwiw your 5X faster version also exceeds the one hour time-out.

n=12 5.034secs
n=16 1308.819secs
n=20 >3605.874secs

Compare with these n=20 measurements --

http://benchmarksgame.alioth.debian.org/u64q/performance.php?test=binarytrees

···

On Saturday, December 19, 2015 2:39 PM, David Turnbull <dturnbull@gmail.com> wrote:

Here's a version that's 5X faster:
The Computer Language Benchmarks Game - Swift binary trees · GitHub

A simple Swift program that completed the workloads within an hour time-out would be an incredible improvement :-)

Instead of looking at the "Winners" compare with the TypeScript program

http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees⟨=typescript&id=2

Why does a Swift transliteration of that program perform so slowly?

···

On Saturday, December 19, 2015 2:39 PM, David Turnbull <dturnbull@gmail.com> wrote:

Winners of this test have pooled memory (apr_pools) and tail recursion.

Winners of this test have pooled memory (apr_pools) and tail recursion.

A simple Swift program that completed the workloads within an hour time-out would be an incredible improvement :-)

Instead of looking at the "Winners" compare with the TypeScript program

http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees⟨=typescript&id=2

Why does a Swift transliteration of that program perform so slowly?

In one word: ARC.
This is pretty much a worst case example for ARC and the execution time is basically just ARC overhead.

To make this faster you'd have to improve on this.
The "simplest" solution would be to simply use UnsafePointers instead of classes or indirect enums.
Basically just rewrite a C example in Swift; not pretty but it should also perform very much like C then.

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

- Janosch

···

On 20 Dec 2015, at 08:48, Isaac Gouy via swift-users <swift-users@swift.org> wrote:
On Saturday, December 19, 2015 2:39 PM, David Turnbull <dturnbull@gmail.com> wrote:

A simple Swift program that completed the workloads within an hour time-out would be an incredible improvement :-)

Instead of looking at the "Winners" compare with the TypeScript program

http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees⟨=typescript&id=2

Why does a Swift transliteration of that program perform so slowly?

Both of these implementations are slow because they always create binary trees with a depth of maxDepth instead of, well, the correct depth.
Changing the following lines (and the respective lines in David's implementation) leads to a more reasonable runtime:

check += bottomUpTree(i,maxDepth).check()
check += bottomUpTree(-i,maxDepth).check()

to

check += bottomUpTree(i,depth).check()
check += bottomUpTree(-i,depth).check()

On my MacBook (i5, 2.6GHz) I get:

n=20
David: ~16 - 17s
Isaac: ~120 - 126s

Because all of the winners are doing it:
Here is a quickly thrown together parallel version of David's implementation using libdispatch (contains also the fixes from above):
https://gist.github.com/psclde/8244c4350b1d4fd1b24b

This version takes about 7 - 8s for n=20.

But libdispatch is currently not working on Linux so it's not really relevant for the benchmark.

Pascal

...

Both of these implementations are slow because they always create binary trees

with a depth of maxDepth instead of, well, the correct depth.

Thank you!

As always, I suspected I'd made a dumb mistake - but just wasn't seeing it.

···

On Sunday, December 20, 2015 9:58 AM, Pascal Urban <mail@pscl.de> wrote:

I was able to speed up David’s solution by almost 50% by changing the method checkTree from this:

func checkTree(t: Array<TreeNodeItem>, _ i: Int) -> Int

to this:

func checkTree(inout t: Array<TreeNodeItem>, _ i: Int) -> Int

It completes in about ~10s instead of ~20s on my 2.66GHz i5 iMac for n=20.
This also works together with Pascal’s libdispatch solution. In this case it completes in ~5s.
Here is the modified version: The Computer Language Benchmarks Game - Swift binary trees · GitHub

Could somebody explain to me why this is the case? I know what the inout keyword does, but I don’t understand why it makes the code faster in this case?

Maurus

···

On 20.12.2015, at 19:08, Isaac Gouy via swift-users <swift-users@swift.org> wrote:

On Sunday, December 20, 2015 9:58 AM, Pascal Urban <mail@pscl.de> wrote:

...

Both of these implementations are slow because they always create binary trees

with a depth of maxDepth instead of, well, the correct depth.

Thank you!

As always, I suspected I'd made a dumb mistake - but just wasn't seeing it.
_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

I was able to speed up David’s solution by almost 50% by changing the method checkTree from this:

func checkTree(t: Array<TreeNodeItem>, _ i: Int) -> Int

to this:

func checkTree(inout t: Array<TreeNodeItem>, _ i: Int) -> Int

It completes in about ~10s instead of ~20s on my 2.66GHz i5 iMac for n=20.
This also works together with Pascal’s libdispatch solution. In this case it completes in ~5s.
Here is the modified version: The Computer Language Benchmarks Game - Swift binary trees · GitHub

Could somebody explain to me why this is the case? I know what the inout keyword does, but I don’t understand why it makes the code faster in this case?

Maurus

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

Like Janosch said: ARC overhead

So here is whats happening:
A function reference parameter gets retained before calling the function and then released within the called function (generally), so the function is taking ownership of the parameter. [1]
If a parameter is marked with the inout keyword the function does not take ownership of the reference parameter, so no retains and releases. [2]

Given how often checkTree is called the retains and releases of the array get quiet expensive.
You can see this quiet nicely while profiling these functions in Instruments using the time profiler.

I think normally this shouldn't be a problem, but because of the recursion the compiler is not able to optimize the retain and release calls.

Another solution could be some kind of wrapper struct/class which has the array as a property and checkTree as a method.
This way checkTree would only access the property and additional retains/releases would not be necessary.

[1] https://github.com/apple/swift/blob/master/docs/SIL.rst#reference-counts
[2] https://github.com/apple/swift/blob/master/docs/SIL.rst#inout-arguments

···

On 20.12.2015, at 19:54, Maurus Kühne via swift-users <swift-users@swift.org> wrote:

On 20.12.2015, at 17:12, Janosch Hildebrand via swift-users <swift-users@swift.org> wrote:

Basically just rewrite a C example in Swift; not pretty but it should also perform very much like C then.

Someone has done this and is now in second place:
http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=5

I was able to speed up David’s solution by almost 50% by changing the method checkTree from this:

func checkTree(t: Array<TreeNodeItem>, _ i: Int) -> Int

to this:

func checkTree(inout t: Array<TreeNodeItem>, _ i: Int) -> Int

It completes in about ~10s instead of ~20s on my 2.66GHz i5 iMac for n=20.
This also works together with Pascal’s libdispatch solution. In this case it completes in ~5s.
Here is the modified version: The Computer Language Benchmarks Game - Swift binary trees · GitHub

Could somebody explain to me why this is the case? I know what the inout keyword does, but I don’t understand why it makes the code faster in this case?

Maurus

_______________________________________________
swift-users mailing list
swift-users@swift.org <mailto:swift-users@swift.org>
https://lists.swift.org/mailman/listinfo/swift-users

Like Janosch said: ARC overhead

So here is whats happening:
A function reference parameter gets retained before calling the function and then released within the called function (generally), so the function is taking ownership of the parameter. [1]
If a parameter is marked with the inout keyword the function does not take ownership of the reference parameter, so no retains and releases. [2]

Given how often checkTree is called the retains and releases of the array get quiet expensive.
You can see this quiet nicely while profiling these functions in Instruments using the time profiler.

I think normally this shouldn't be a problem, but because of the recursion the compiler is not able to optimize the retain and release calls.

This is not true. We can hit this case. It requires further work in the ARC optimizer. We some time ago actually did have the functionality to hit this case, but I had to disable it b/c of some correctness issues.

File a bug with this test case and assign to me.

Michael

···

On Dec 22, 2015, at 2:31 PM, Pascal Urban via swift-users <swift-users@swift.org> wrote:

On 20.12.2015, at 19:54, Maurus Kühne via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

Another solution could be some kind of wrapper struct/class which has the array as a property and checkTree as a method.
This way checkTree would only access the property and additional retains/releases would not be necessary.

[1] https://github.com/apple/swift/blob/master/docs/SIL.rst#reference-counts
[2] https://github.com/apple/swift/blob/master/docs/SIL.rst#inout-arguments

On 20.12.2015, at 17:12, Janosch Hildebrand via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

Basically just rewrite a C example in Swift; not pretty but it should also perform very much like C then.

Someone has done this and is now in second place:
http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=5 <http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=5&gt;

_______________________________________________
swift-users mailing list
swift-users@swift.org <mailto:swift-users@swift.org>
https://lists.swift.org/mailman/listinfo/swift-users

I was able to speed up David’s solution by almost 50% by changing the method checkTree from this:

func checkTree(t: Array<TreeNodeItem>, _ i: Int) -> Int

to this:

func checkTree(inout t: Array<TreeNodeItem>, _ i: Int) -> Int

It completes in about ~10s instead of ~20s on my 2.66GHz i5 iMac for n=20.
This also works together with Pascal’s libdispatch solution. In this case it completes in ~5s.
Here is the modified version: The Computer Language Benchmarks Game - Swift binary trees · GitHub

Could somebody explain to me why this is the case? I know what the inout keyword does, but I don’t understand why it makes the code faster in this case?

Maurus

_______________________________________________
swift-users mailing list
swift-users@swift.org <mailto:swift-users@swift.org>
https://lists.swift.org/mailman/listinfo/swift-users

Like Janosch said: ARC overhead

So here is whats happening:
A function reference parameter gets retained before calling the function and then released within the called function (generally), so the function is taking ownership of the parameter. [1]
If a parameter is marked with the inout keyword the function does not take ownership of the reference parameter, so no retains and releases. [2]

Given how often checkTree is called the retains and releases of the array get quiet expensive.
You can see this quiet nicely while profiling these functions in Instruments using the time profiler.

I think normally this shouldn't be a problem, but because of the recursion the compiler is not able to optimize the retain and release calls.

This is not true. We can hit this case. It requires further work in the ARC optimizer. We some time ago actually did have the functionality to hit this case, but I had to disable it b/c of some correctness issues.

File a bug with this test case and assign to me.

Nice. I figured that this would be optimized in the future.
Filed as SR-356 with a simplified test case.

···

On 22.12.2015, at 21:40, Michael Gottesman <mgottesman@apple.com> wrote:

On Dec 22, 2015, at 2:31 PM, Pascal Urban via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

On 20.12.2015, at 19:54, Maurus Kühne via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

Michael

Another solution could be some kind of wrapper struct/class which has the array as a property and checkTree as a method.
This way checkTree would only access the property and additional retains/releases would not be necessary.

[1] https://github.com/apple/swift/blob/master/docs/SIL.rst#reference-counts
[2] https://github.com/apple/swift/blob/master/docs/SIL.rst#inout-arguments

On 20.12.2015, at 17:12, Janosch Hildebrand via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

Basically just rewrite a C example in Swift; not pretty but it should also perform very much like C then.

Someone has done this and is now in second place:
http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=5 <http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=5&gt;

_______________________________________________
swift-users mailing list
swift-users@swift.org <mailto:swift-users@swift.org>
https://lists.swift.org/mailman/listinfo/swift-users

I was able to speed up David’s solution by almost 50% by changing the method checkTree from this:

func checkTree(t: Array<TreeNodeItem>, _ i: Int) -> Int

to this:

func checkTree(inout t: Array<TreeNodeItem>, _ i: Int) -> Int

It completes in about ~10s instead of ~20s on my 2.66GHz i5 iMac for n=20.
This also works together with Pascal’s libdispatch solution. In this case it completes in ~5s.
Here is the modified version: The Computer Language Benchmarks Game - Swift binary trees · GitHub

Could somebody explain to me why this is the case? I know what the inout keyword does, but I don’t understand why it makes the code faster in this case?

Maurus

_______________________________________________
swift-users mailing list
swift-users@swift.org <mailto:swift-users@swift.org>
https://lists.swift.org/mailman/listinfo/swift-users

Like Janosch said: ARC overhead

So here is whats happening:
A function reference parameter gets retained before calling the function and then released within the called function (generally), so the function is taking ownership of the parameter. [1]
If a parameter is marked with the inout keyword the function does not take ownership of the reference parameter, so no retains and releases. [2]

Given how often checkTree is called the retains and releases of the array get quiet expensive.
You can see this quiet nicely while profiling these functions in Instruments using the time profiler.

I think normally this shouldn't be a problem, but because of the recursion the compiler is not able to optimize the retain and release calls.

This is not true. We can hit this case. It requires further work in the ARC optimizer. We some time ago actually did have the functionality to hit this case, but I had to disable it b/c of some correctness issues.

File a bug with this test case and assign to me.

Nice. I figured that this would be optimized in the future.
Filed as SR-356 with a simplified test case.

Thanks!

Michael

···

On Dec 23, 2015, at 11:32 AM, Pascal Urban <mail@pscl.de> wrote:

On 22.12.2015, at 21:40, Michael Gottesman <mgottesman@apple.com <mailto:mgottesman@apple.com>> wrote:

On Dec 22, 2015, at 2:31 PM, Pascal Urban via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

On 20.12.2015, at 19:54, Maurus Kühne via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

Michael

Another solution could be some kind of wrapper struct/class which has the array as a property and checkTree as a method.
This way checkTree would only access the property and additional retains/releases would not be necessary.

[1] https://github.com/apple/swift/blob/master/docs/SIL.rst#reference-counts
[2] https://github.com/apple/swift/blob/master/docs/SIL.rst#inout-arguments

On 20.12.2015, at 17:12, Janosch Hildebrand via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

Basically just rewrite a C example in Swift; not pretty but it should also perform very much like C then.

Someone has done this and is now in second place:
http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=5 <http://benchmarksgame.alioth.debian.org/u64q/program.php?test=binarytrees&lang=swift&id=5&gt;

_______________________________________________
swift-users mailing list
swift-users@swift.org <mailto:swift-users@swift.org>
https://lists.swift.org/mailman/listinfo/swift-users