Performance vs C++, first things to look for?

I wrote some computational geometry code - computing line segment intersections in bulk, and triangulating polygons. The data sets are pretty big and it's got some speed issues.

As an experiment, I took a couple days and ported it to C++. I pasted the code, replaced Array with std::vector, added 1000 semicolons – nothing fancy. It immediately ran almost 10x faster. I'm no C++ guru, but I see a lot of places there where I can control things and try to optimize further, whereas in Swift I'm kind of clueless on this front.

I'm wondering:

  1. Are there some key things to check when trying to get Swift to run faster? I've never really used inline annotations, for example.
  2. I have some generic types that work with Float, Double, SIMD2<Float>, SIMD2<Double>. I wonder if something is getting boxed and not specialized, even in release build. How to tell?
  3. Is it realistic to get close to C++ performance for this kind of code, which is more computer graphics and number crunching than my typical app code? Perhaps C++ is the better tool for the job here.

The first thing is almost always to run with a sampling profiler and see where the time is actually being spent. If you're on a Mac, that means doing a "time profile" with Instruments. Other similar tools are available on Linux or Windows or other platforms.

The zeroth thing to check is that you're running in release configuration, rather than debug.

After you do those, it will likely be reasonably clear what you need to chip away at. There's really no reason why this sort of code can't be as fast as or faster than C++. Post the results of your profiling and we can advise you from there.


Thanks. Yes, it's a release build. I ran it in Instruments "Time Profiler" and immediately found some low-hanging fruit: a String I was formatting for logging, which was not necessary.

But now I'm not sure how to read this Instruments display. Looks like some time spent in my RedBlackBST class, but it's spread out. Should I post a textual version of that screen, or... ??

Make sure to select “invert call tree”, as that often yields more actionable information.


In most examples I've seen of comparative benchmark differences, it isn't a matter of "knobs to twist", but rather a case of a clearly new allocation in Swift (of a class, tree of classes, array, string), often happening in a tight loop, that just isn't happening in the C++ version. Look at each line and wonder "is this creating/copying a new thing (class, array, tree etc) where in the C++ version it's just mutating an existing thing in-place?".

Once you get into "how do I do it in place" then sometimes you need to start considering more advanced techniques in Swift. But often it's literally a mutating method instead of an immutable one that returns a fresh copy.
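As a rough illustration of that last point, here is a minimal sketch contrasting a method that returns a fresh copy with a mutating one. The `Polyline` type and both method names are hypothetical, made up for this example and not taken from the original poster's code.

```swift
// Hypothetical type for illustration only.
struct Polyline {
    var points: [SIMD2<Double>]

    // Immutable style: builds and returns a new array on every call.
    func translated(by offset: SIMD2<Double>) -> Polyline {
        Polyline(points: points.map { $0 + offset })
    }

    // Mutating style: updates the existing storage in place
    // (no new allocation as long as the array buffer is uniquely referenced).
    mutating func translate(by offset: SIMD2<Double>) {
        for i in points.indices {
            points[i] += offset
        }
    }
}

var line = Polyline(points: [SIMD2<Double>(0, 0), SIMD2<Double>(1, 2)])
line.translate(by: SIMD2<Double>(10, 10))   // in place
let copy = line.translated(by: SIMD2<Double>(1, 1))   // allocates a new array
```

In a tight loop, calling the first form repeatedly is the kind of hidden allocation that tends to show up near the top of an inverted call tree.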

(Steve is right, though, that often the fastest way to spot this is an inverted call tree in Instruments.)


The Time Profiler appears to be sampling, so I guess that's why I don't see the exact number of calls to a function. Is there a way to get that? I guess it would slow down the program, but it might be useful in combination with this less intrusive profile. Also, any idea what "Self Weight" is vs "Weight"?

Show the structs you use, perhaps a few code fragments, or a GitHub repo, then we can suggest what to improve. For that domain (geometry app) I'd say this:

  • don't use strings in your structs
  • a struct of 3 (or even 100) floats is better than an Array.
  • a struct is better than a class
  • preallocated memory is faster than allocating / deallocating.
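A minimal sketch of a couple of these points, assuming a hypothetical `Segment` value type (not from the original code): the segment is a struct holding its coordinates inline, and the array that stores the segments reserves its capacity once up front instead of growing repeatedly.

```swift
// Hypothetical value type: both endpoints are stored inline,
// so there is no separate heap allocation per segment.
struct Segment {
    var a: SIMD2<Double>
    var b: SIMD2<Double>
}

func makeSegments(count: Int) -> [Segment] {
    var segments: [Segment] = []
    segments.reserveCapacity(count)   // allocate the storage once, up front
    for i in 0..<count {
        let x = Double(i)
        segments.append(Segment(a: SIMD2<Double>(x, 0), b: SIMD2<Double>(x, 1)))
    }
    return segments
}

let segments = makeSegments(count: 1_000)
```

If `Segment` were a class instead, each element would likely be a separate heap object with reference counting traffic on top, which is the sort of cost the list above is warning about.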

In some areas C++ would be faster. For example, you can preallocate an arena with one big malloc and then use placement new to construct nodes without paying for a memory allocation each time. This is possible to express in Swift using unsafe pointers, but the code would look complex and far from nice compared to C++. (Perhaps there's a way to hide that complexity in some wrappers.)
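For what it's worth, here is one way such a wrapper might look. This is only a sketch under assumptions: `Node` and `NodeArena` are hypothetical names, nodes refer to each other by index rather than by pointer, and the arena has a fixed capacity that never grows.

```swift
// Hypothetical tree node; children are indices into the arena, -1 means "none".
struct Node {
    var key: Double
    var left: Int32 = -1
    var right: Int32 = -1
}

// Hypothetical fixed-capacity arena: one allocation up front,
// then nodes are constructed in place with no further allocation.
final class NodeArena {
    private let storage: UnsafeMutablePointer<Node>
    private let capacity: Int
    private(set) var count = 0

    init(capacity: Int) {
        self.capacity = capacity
        storage = UnsafeMutablePointer<Node>.allocate(capacity: capacity)
    }

    deinit {
        storage.deinitialize(count: count)
        storage.deallocate()
    }

    // Constructs a node in the next free slot and returns its index.
    func makeNode(key: Double) -> Int32 {
        precondition(count < capacity, "arena is full")
        let index = count
        (storage + index).initialize(to: Node(key: key))
        count += 1
        return Int32(index)
    }

    subscript(index: Int32) -> Node {
        get { storage[Int(index)] }
        set { storage[Int(index)] = newValue }
    }
}

let arena = NodeArena(capacity: 1_024)
let root = arena.makeNode(key: 5)
let child = arena.makeNode(key: 3)
arena[root].left = child
```

The unsafe code stays inside the class, so the call sites look like ordinary Swift. Whether this actually beats an Array of nodes for a given workload is something to confirm with the profiler.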

For ultimate speed you may consider leaving that code as C/C++, unless you have some policy or another good reason not to. Still, without too much time and effort you can get reasonably close to C++ performance in Swift without losing too much of its swiftness.

"Weight" is the amount of time spent in that function + everything it calls. "Self weight" is the time spent specifically in that function.


I've found this book helpful in writing efficient algorithms: Optimizing Collections · objc.io

Computational geometry tends to work better with an UnsafeBufferPointer instead of an Array so you can skip things like bounds checks. Whenever the Swift Ownership Manifesto gets implemented, it should help you write implementations with performance closer to C++.
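A minimal sketch of that approach, using the standard library's `withUnsafeBufferPointer` on an Array. The `signedArea(of:)` function is a hypothetical example (the shoelace formula), not code from this thread. As far as I know, the buffer pointer's subscript is only bounds-checked in debug builds, which is where the saving would come from.

```swift
// Hypothetical example: signed area of a polygon via the shoelace formula,
// iterating over the array's storage through an unsafe buffer pointer.
func signedArea(of polygon: [SIMD2<Double>]) -> Double {
    return polygon.withUnsafeBufferPointer { points -> Double in
        guard points.count >= 3 else { return 0 }
        var twiceArea = 0.0
        var previous = points[points.count - 1]
        for i in 0..<points.count {
            let current = points[i]
            twiceArea += previous.x * current.y - current.x * previous.y
            previous = current
        }
        return twiceArea / 2
    }
}

// A 2 x 2 square with counter-clockwise winding.
let square: [SIMD2<Double>] = [
    SIMD2<Double>(0, 0), SIMD2<Double>(2, 0),
    SIMD2<Double>(2, 2), SIMD2<Double>(0, 2),
]
let area = signedArea(of: square)   // 4.0
```

The pointer is only valid inside the closure, so the unsafe part stays small and local. It is worth measuring before and after, since the optimizer can often remove bounds checks from simple loops over an Array on its own.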
