Optimization levels do nothing no matter where I set them

I have a 2D software renderer, which means lots of loops writing pixels, and that is of course very sensitive to running in debug mode. Unfortunately, the only way to build in release mode (that works) is to run swift build -c release and then manually run the executable. swift run -c release still says it's building for debugging, for whatever reason. Using the @_optimize(speed) attribute also has no effect.

This is extremely noticeable: even at a resolution of 320 by 200, debug mode is a slideshow while doing nothing but setting all pixels to black (because on literally every loop iteration Swift is accessing some witness tables for ranges). In release mode it's perfectly fine.

I'm using the latest Xcode and the included version of Swift.

Passing -Ospeed via -Xswiftc also does nothing.

Setting Build Configuration to "Release" in Xcode causes running to fail: it tries to link x86_64 versions of the system libraries used (on arm macOS), something it doesn't do in any other case.

Tangentially, if you're able to share concise example code, some of the compiler folks here in the forums might be interested in taking a look. Maybe there are tweaks that can be made to improve the performance of even your debug builds.

Sure!

This is shown when profiling:

This is the loop:

var api = Renderer(width: Self.display.w, height: Self.display.h)
var event = SDL_Event()

loop:
while true {
    while SDL_PollEvent(&event) != 0 {
        switch event.type {
            case SDL_QUIT.rawValue: break loop
            default: break
        }
    }
    
    instance.update(input: Input(width: display.w, height: display.h))
    
    SDL_SetRenderDrawColor(renderer, 0, 0, 0, 1)
    SDL_RenderClear(renderer)
    instance.draw(renderer: &api)
    for x in 0..<api.display.width {
        for y in 0..<api.display.height {
            let color = api.display[x, y].rgb
            SDL_SetRenderDrawColor(renderer, color.r, color.g, color.b, 255)
            var rect = SDL_Rect(
                x: Int32(pixelSize * x + windowMargin),
                y: Int32(pixelSize * y + windowMargin),
                w: pixelSize, h: pixelSize
            )
            SDL_RenderFillRect(renderer, &rect)
        }
    }
    SDL_RenderPresent(renderer)
}

The profiler is also highlighting the loops themselves, with the second (inner) one as the biggest problem. It seems to be using completely un-inlined iterators with useless bounds checking here.
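One way to sidestep both the range iterator and the per-element array subscript is to walk the pixel storage as one flat buffer. The sketch below is only an illustration: FlatDisplay, its row-major [UInt32] storage, and clear(to:) are hypothetical stand-ins, since the thread's Display type is not shown. As far as I know, buffer pointer subscripts are bounds-checked only in debug builds, so this mainly removes the iterator machinery there rather than every check.

```swift
// Hypothetical stand-in for the thread's Display type (assumption):
// one packed 0xRRGGBB value per pixel, stored row-major in a flat array.
struct FlatDisplay {
    let width: Int
    let height: Int
    var pixels: [UInt32]

    init(width: Int, height: Int) {
        self.width = width
        self.height = height
        self.pixels = [UInt32](repeating: 0, count: width * height)
    }

    // Fill every pixel with a single flat while loop over a buffer pointer,
    // avoiding nested ranges and the 2D subscript.
    mutating func clear(to color: UInt32) {
        pixels.withUnsafeMutableBufferPointer { buffer in
            let count = buffer.count
            var i = 0
            while i < count {
                buffer[i] = color
                i += 1
            }
        }
    }
}

var flatDisplay = FlatDisplay(width: 320, height: 200)
flatDisplay.clear(to: 0xFF0000)
print(flatDisplay.pixels.count, flatDisplay.pixels[0] == 0xFF0000)
```

Whether this helps enough in a debug build would need to be confirmed with the profiler.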

I also see this problem in many other places when iterating 2D collections:

public struct Renderer: ~Copyable {
    internal var display: Display
    ...
    public mutating func rectangle(x: Int, y: Int, w: Int, h: Int, color: Color = .white, fill: Bool = false) {
        for sx in 0..<w {
            for sy in 0..<h {
                if sx + x == x || sx + x == x + w - 1 || sy + y == y || sy + y == y + h - 1 || fill {
                    self.pixel(x: sx + x, y: sy + y, color: color)
                }
            }
        }
    }
    ...
}
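Independent of optimization level, the unfilled case above tests every pixel of the w by h area just to touch the border. A possible restructuring, sketched here with a hypothetical Canvas type (the real Renderer, Display, and Color are not shown in the thread, so a UInt8 color and a clipping pixel(x:y:color:) are assumptions), draws the four edges directly, which is roughly 2 * (w + h) pixel writes instead of w * h condition checks.

```swift
// Hypothetical minimal canvas (assumption); stands in for Renderer/Display.
struct Canvas {
    let width: Int
    let height: Int
    var pixels: [UInt8]

    init(width: Int, height: Int) {
        self.width = width
        self.height = height
        self.pixels = [UInt8](repeating: 0, count: width * height)
    }

    mutating func pixel(x: Int, y: Int, color: UInt8) {
        // Assumption: out-of-range coordinates are clipped rather than trapping.
        guard x >= 0, x < width, y >= 0, y < height else { return }
        pixels[y * width + x] = color
    }

    mutating func rectangle(x: Int, y: Int, w: Int, h: Int, color: UInt8, fill: Bool = false) {
        guard w > 0, h > 0 else { return }
        if fill {
            // Filled: every pixel is written, so no per-pixel border test is needed.
            for sy in 0..<h {
                for sx in 0..<w {
                    pixel(x: x + sx, y: y + sy, color: color)
                }
            }
            return
        }
        // Outline: top and bottom rows, then left and right columns.
        for sx in 0..<w {
            pixel(x: x + sx, y: y, color: color)
            pixel(x: x + sx, y: y + h - 1, color: color)
        }
        for sy in 0..<h {
            pixel(x: x, y: y + sy, color: color)
            pixel(x: x + w - 1, y: y + sy, color: color)
        }
    }
}

var canvas = Canvas(width: 5, height: 5)
canvas.rectangle(x: 1, y: 1, w: 3, h: 3, color: 1)
print(canvas.pixels)
```

The outline branch sets the same pixels as the original condition (sx == 0, sx == w - 1, sy == 0, sy == h - 1), with corners written twice.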

I can see two ways to fix this. One would be to use Metal or OpenGL and stream the buffer to the GPU as a texture, instead of this amazing SDL rect renderer (although it's still surprisingly not slow). That would only fix one of the loops, though. The other would probably be replacing ranges across my code with C-style for loops, but those have been removed from Swift, so I need an ugly while loop instead.
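For reference, the while-loop replacement mentioned above could look like the following minimal sketch. The function names and the flat [UInt8] buffer are made up for illustration; both functions write the same values, and the only difference is that the second avoids the Range iterator, which is where unoptimized builds tend to spend their time.

```swift
// Range-based nested loops, as used throughout the thread's code.
func fillWithRanges(_ buffer: inout [UInt8], width: Int, height: Int, value: UInt8) {
    for y in 0..<height {
        for x in 0..<width {
            buffer[y * width + x] = value
        }
    }
}

// Hand-written equivalent of a C-style for loop: plain Int counters,
// no Range or iterator involved.
func fillWithWhile(_ buffer: inout [UInt8], width: Int, height: Int, value: UInt8) {
    var y = 0
    while y < height {
        var x = 0
        while x < width {
            buffer[y * width + x] = value
            x += 1
        }
        y += 1
    }
}

let gridWidth = 320
let gridHeight = 200
var viaRanges = [UInt8](repeating: 0, count: gridWidth * gridHeight)
var viaWhile = [UInt8](repeating: 0, count: gridWidth * gridHeight)
fillWithRanges(&viaRanges, width: gridWidth, height: gridHeight, value: 9)
fillWithWhile(&viaWhile, width: gridWidth, height: gridHeight, value: 9)
print(viaRanges == viaWhile)
```

In an optimized build the two forms should compile to much the same code, so this only matters for debug performance.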

I was able to figure out why nothing was working: I was accidentally passing the arguments after the target name. I can now run release builds from my terminal.
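When it is unclear which configuration actually got built, the running binary can report it itself. This is a small sketch using a common idiom rather than anything from the thread: isDebugBuild is a hypothetical helper name, and it relies on the condition passed to assert only being evaluated in unoptimized (-Onone) builds.

```swift
// Hypothetical helper (assumption): reports whether this binary was
// compiled without optimization.
func isDebugBuild() -> Bool {
    var debug = false
    // The closure runs only when assertions are enabled, i.e. in -Onone builds;
    // in optimized builds the condition is never evaluated.
    assert({ debug = true; return true }())
    return debug
}

print(isDebugBuild() ? "debug (unoptimized) build" : "optimized build")
```

Printing this once at startup makes it obvious whether the flags reached the compiler.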

That still doesn't explain why the optimization attribute doesn't fix these loops, or why Xcode fails to run in release mode.
