Your benchmark code looks fine to me.
You could put the individual tests into @inline(never) functions; then it would be easier to look at what an individual test does in Xcode’s Instruments.
It is true that small changes can trigger large swings in runtime performance: a seemingly small change can, for example, alter inlining decisions or prevent the removal of ARC operations, and then performance is wildly different.
Given that you only see a ~3% regression in -O mode, where we are not able to remove all the abstraction, the performance numbers you report look good to me.
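For illustration, here is a minimal sketch of that setup in Swift 2-era syntax (the test body is a hypothetical stand-in, not code from the actual benchmark):

// Wrapping each test in its own @inline(never) function keeps it as a
// distinct frame in Instruments and stops the optimizer from folding
// separate tests into each other.
@inline(never)
func runSumTest(iterations: Int) -> Int {
    var checksum = 0
    for i in 0..<iterations {
        checksum = checksum &+ i
    }
    // Returning (and later printing) the result keeps the loop from
    // being optimized away entirely.
    return checksum
}

print(runSumTest(10_000_000))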
···
From: Károly Lőrentey via swift-evolution <swift-evolution@swift.org>
Date: February 8, 2016 at 7:08:33 AM PST
To: Dave Abrahams
Cc: swift-evolution@swift.org
Subject: Re: Draft: Add @noescape and rethrows to ManagedBuffer API
On 2016-02-07, at 16:18, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
On first look this seems to be a great idea. Have you checked for
performance impact?
Yes, although I found it challenging to create a good
microbenchmark for this. Subtle changes in the benchmarking code
lead to large swings in runtime performance, which makes me question
the usefulness of my results.
Keeping that in mind, for a trivial ManagedBuffer subclass, I found
that @noescape makes for a ~15-18% improvement when whole module
optimization is disabled, or when the subclass is imported.
Throwing in the rethrows declarations reduces the improvement to
~9-13%, or (in the case of a particular subscript getter test) even
reverses it, making the code ~3% slower.
The proposal has no discernible impact on ManagedBuffer subclasses
that the optimizer has full access to. (I.e., when they’re defined
in the same file as the code that’s using them, or in the same
module with WMO.) Unoptimized code also seems unaffected by these
changes.
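For reference, the shape of the proposed change, sketched here on a hypothetical toy type rather than the real ManagedBuffer signatures (Swift 2-era syntax):

final class ToyBuffer<Element> {
    let storage: UnsafeMutablePointer<Element>
    // (Deallocation omitted for brevity.)
    init(count: Int) {
        storage = UnsafeMutablePointer<Element>.alloc(count)
    }

    // Before: the closure is assumed to escape and cannot throw.
    // func withElements<R>(body: (UnsafeMutablePointer<Element>) -> R) -> R

    // After: @noescape lets the compiler avoid allocating and retaining
    // a closure context; rethrows lets throwing closures through while
    // keeping the non-throwing case free of error-handling overhead.
    func withElements<R>(@noescape body: (UnsafeMutablePointer<Element>) throws -> R) rethrows -> R {
        return try body(storage)
    }
}

// Non-throwing closures need no `try` at the call site:
let buffer = ToyBuffer<Int>(count: 1)
buffer.withElements { $0.initialize(42) }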
My benchmarking code is on GitHub; feedback would be very welcome:
I don't expect @noescape to have much effect on optimization yet, since we don't propagate that information to SIL at all.
-Joe