`MutableSpan` comes with performance penalties compared to `UnsafePointer`s

MahdiBM · June 15, 2026, 1:44pm

Mostly to redirect some eyes to this issue: Don't impose a performance penalty when using `MutableSpan` · Issue #89954 · swiftlang/swift · GitHub

What happened is that I created this PR, and there I noticed that using MutableSpan makes the IPAddressBenchs:IPv4_String_Encoding_Localhost_15M benchmark (encode localhost 15 million times) go from ~200ms to ~350ms, which is a +75% increase.

From what I've noticed, MutableSpan is in all cases essentially unusable if you really want to go for maximum performance and compete with C. There are also no escape hatches from what I can tell.
To be clear Span is perfectly fine. I'm talking only about MutableSpan.

Optimally we can make sure even Swift 6.4 doesn't come with this penalty when it's released so we don't have to wait another 6-9 months.
I can work on the changes if the compiler folks agree with me.

MahdiBM · June 15, 2026, 1:52pm

Just so the issue is more clear, here's a link to the commit which successfully reverted the perf degradation:

John_McCall · June 15, 2026, 1:55pm

Neither this post nor the issue say anything about what these expensive unnecessary checks actually are.

MahdiBM · June 15, 2026, 2:06pm

Ah right, sorry. For once I was trying to go for a quick issue, but then I managed to get myself into a forums thread.

What I bumped into: you hit those checks when you call mutableSpan on the unsafe buffer pointer types. From a quick look at the stdlib, it also appears that pretty much every place that returns such a MutableSpan goes through them (not 100% sure).

Here's a link to what exactly my code was calling:

github.com/swiftlang/swift

stdlib/public/core/UnsafeBufferPointer.swift.gyb

f5a73477a


      
            ///
            /// - Returns: A `MutableSpan` over the elements of this buffer.
            ///
            /// - Complexity: O(1)
            @unsafe
            @_alwaysEmitIntoClient
            public var mutableSpan: MutableSpan<Element> {
              @_lifetime(borrow self)
              @_transparent
              get {
                unsafe MutableSpan(_unsafeElements: self)
              }
            }
          %end
          }
          
          extension Unsafe${Mutable}BufferPointer where Element: ~Copyable {
            /// Returns a Boolean value indicating whether two instances refer to the same
            /// memory region.
            ///
            /// - Complexity: O(1)

which calls:

github.com/swiftlang/swift

stdlib/public/core/Span/MutableSpan.swift

f5a73477a


      
          internal init(
            _unchecked elements: UnsafeMutableBufferPointer<Element>
          ) {
            unsafe _pointer = .init(elements.baseAddress)
            _count = elements.count
          }
          
          @unsafe
          @_alwaysEmitIntoClient
          @_lifetime(borrow buffer)
          public init(
            _unsafeElements buffer: UnsafeMutableBufferPointer<Element>
          ) {
            _precondition(
              ((Int(bitPattern: buffer.baseAddress) &
                (MemoryLayout<Element>.alignment &- 1)) == 0),
              "baseAddress must be properly aligned to access Element"
            )
            let ms = unsafe MutableSpan<Element>(_unchecked: buffer)
            self = unsafe _overrideLifetime(ms, borrowing: buffer)
          }

I can already see the checks do not really look that expensive, but well, in practice they're adding 75% to the runtime of the benchmark.
I'd presume MutableSpan is supposed to be the safe alternative to mutable buffer pointers, but then I'd expect it to perform the same as well, like Span does.

And again, someone else also bumped into the same issue some months ago when working on some crypto package, so I know I'm not the only one that is not happy with this penalty.

snej · June 15, 2026, 3:41pm

The precondition seems important for correctness, though. Otherwise you can create misaligned spans that (on some CPUs) crash later on, in “safe” code, when dereferenced.

I would hope that the check gets optimized out for single-byte types like UInt8, which is by far the most common type of span I use.
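To make the single-byte case concrete, here is a minimal sketch of the same mask-based test as the quoted precondition, written against plain raw pointers. `isAligned` is a hypothetical helper name, not a stdlib API, and the sketch assumes `UInt32` has an alignment greater than 1 (true on common platforms).

```swift
/// Mirrors the mask-based alignment test from the quoted precondition.
/// `isAligned` is a hypothetical helper, not part of the standard library.
func isAligned<T>(_ address: UnsafeRawPointer, for type: T.Type) -> Bool {
    // Alignments are powers of two, so `alignment - 1` masks the low bits.
    let mask = MemoryLayout<T>.alignment &- 1
    return (Int(bitPattern: address) & mask) == 0
}

// Allocate 8-byte-aligned storage, then form a deliberately odd address.
let storage = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 8)
let base = UnsafeRawPointer(storage)
let odd = base.advanced(by: 1)

// For UInt8 the mask is 0, so the test cannot fail at any address.
let byteAtOdd = isAligned(odd, for: UInt8.self)
let wordAtBase = isAligned(base, for: UInt32.self)
let wordAtOdd = isAligned(odd, for: UInt32.self)
storage.deallocate()

print(byteAtOdd, wordAtBase, wordAtOdd)
```

Since the mask is a compile-time constant of 0 for 1-byte types, one would expect a specialized build to fold the check away, but whether that actually happens in a given call chain is exactly what this thread is questioning.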

RandomHashTags · June 15, 2026, 5:09pm

I expected more detailed explanations and concrete evidence with such a title and proclamation. Using MutableSpan instead of unsafe pointers in your project did not affect benchmark performance on my local Linux machine (7.0.11-arch1-1).

I use MutableSpan extensively in my high-performance libraries, which include a league scheduling engine and a networking library, with no runtime drawbacks anywhere near the extent you observe, if any.

MahdiBM · June 15, 2026, 5:26pm

I can provide actual benchmark cases if asked to, but I thought (think?) the bottleneck is observable to a compiler engineer already, so hopefully I can save myself some time.
Again, if there is a need, I can provide that. I've already mentioned the benchmark name, and the commit with which the degradation is reversed. So the info is already here, although I understand I should not expect people to go after them themselves and optimally just clearly provide them.

The drawback isn't actually that large; the library is magnifying it.

It's only noticeable when you're pixel-peeping into the performance of your library and there are no other visible performance bottlenecks already, especially none of the more important kinds, such as algorithmic bottlenecks or not having chosen the correct data structures.

If you think about it, the whole performance degradation comes down to a couple of light integer operations (e.g. no division), although there might be more to the performance issue that MutableSpan causes that I haven't noticed.
The benchmark which it degraded, is also only a bunch of light arithmetic for the most part, to parse IPv4 from the bytes it's been handed.

2 reasons for the fact that I'm noticing this at all are:
1- The library has benchmarks to compare against native C libraries like Darwin/glibc, and it performs even better than them in most cases. So you know the implementation is very (too?) optimized.
2- There were already benchmarks with better performance, so it was easy to notice there is a degradation.

So essentially, if you're not aiming for very high performance or comparing to a previous baseline, you likely won't notice it.

MahdiBM · June 15, 2026, 5:33pm

Right. What I'm hoping is that perhaps due to semantics of specific types or where they're being used, the compiler engineers can already infer that the checks are redundant and they can skip those via an unchecked initializer that simply just puts the type together.

Also in any case, I expected stdlib to provide escape hatches for such a type, even if they are annotated with a hundred "unchecked" "unsafe"s. Sometimes, e.g. when you've just allocated memory for a number of UInt8s, you already know you don't need such checks.

I'd expect that too. Makes me think if there is possibly any other thing going on in here that we haven't noticed.

RandomHashTags · June 15, 2026, 8:44pm

So the exact performance overhead you're observing is mutable span initializers that use _precondition, which will impact performance. So you are right, but the title does misrepresent the impact of using MutableSpan.

side note regarding "very high performance"

Furthermore, this quote may be generalizing to help the average reader, which is understandable, however, you're assuming I don't know how to properly benchmark or write objectively optimal code.

Looking at your library again, I've optimized your writeTextualRepresentation to perform objectively better (since I won't try rewriting the whole project), if you actually want very high performance:

    @inlinable
    @inline(__always)
    func writeTextualRepresentation2(into buffer: UnsafeMutableBufferPointer<UInt8>) -> Int {
        var resultIdx = 0
        var addressByte = address.byteSwapped
        UInt8(addressByte & 0xFF).asDecimal(
            writeUTF8Byte: {
                buffer[resultIdx] = $0
                resultIdx &+= 1
            }
        )
        for _ in 1..<4 {
            buffer[resultIdx] = .asciiDot
            resultIdx &+= 1

            addressByte >>= 8
            UInt8(addressByte & 0xFF).asDecimal(
                writeUTF8Byte: {
                    buffer[resultIdx] = $0
                    resultIdx &+= 1
                }
            )
        }
        return resultIdx
    }
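The snippet above depends on the library's own `address`, `asDecimal` and `.asciiDot`. For anyone who wants to run the idea in isolation, here is a minimal self-contained sketch of the same byte-swap-then-shift approach. `writeDecimal` and `writeIPv4` are hypothetical stand-ins, not the library's functions, and the sketch assumes the address is stored with its first octet in the most significant byte.

```swift
/// Hypothetical stand-in for the library's `asDecimal(writeUTF8Byte:)` helper.
func writeDecimal(_ value: UInt8, into buffer: UnsafeMutableBufferPointer<UInt8>, at index: inout Int) {
    let asciiZero: UInt8 = 48
    if value >= 100 {
        buffer[index] = asciiZero &+ value / 100
        index &+= 1
    }
    if value >= 10 {
        buffer[index] = asciiZero &+ (value / 10) % 10
        index &+= 1
    }
    buffer[index] = asciiZero &+ value % 10
    index &+= 1
}

/// Writes dotted-decimal text and returns the number of bytes written.
/// Assumption: the first octet sits in the most significant byte of `address`.
func writeIPv4(_ address: UInt32, into buffer: UnsafeMutableBufferPointer<UInt8>) -> Int {
    var index = 0
    // After the swap the first octet is in the low byte, so each
    // iteration can take the low byte and shift the rest down.
    var remaining = address.byteSwapped
    for position in 0..<4 {
        if position > 0 {
            buffer[index] = UInt8(ascii: ".")
            index &+= 1
        }
        writeDecimal(UInt8(truncatingIfNeeded: remaining), into: buffer, at: &index)
        remaining >>= 8
    }
    return index
}

let localhost: UInt32 = 0x7F00_0001  // 127.0.0.1
let buffer = UnsafeMutableBufferPointer<UInt8>.allocate(capacity: 15)
let written = writeIPv4(localhost, into: buffer)
let text = String(decoding: buffer[..<written], as: UTF8.self)
buffer.deallocate()
print(text)  // prints "127.0.0.1"
```

The 15-byte capacity covers the longest possible output, `255.255.255.255`.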

MahdiBM · June 16, 2026, 7:10am

Response to "side note regarding "very high performance""

I appreciate the time, but I don't think the code matters too much here. As a matter of fact, GodBolt shows that the two versions compile to almost the same code anyway:

The code is too simple for the compiler!

Also, IPv4 parsing is among the simpler things that the package does.
Performant IPv6 parsing is far more complex due to the compression sign (::). Even more complicated than that are the IDNA-compatibility functions, where simple integer arithmetic and stack allocations are no longer enough. From memory, UniqueArray/RigidArray/custom-"TinyArray"/"SmallArray"-implementations/lifetime-annotations had to be used (short of using raw UnsafePointers) to reach the desired performance, which was to be able to compete against ICU.

MahdiBM · June 16, 2026, 7:17am

Hmm, I can't decide if it does or it doesn't, which means it should be clearer anyway. I'll try to adjust the title. To be fair, I am using "relatively" in the title.

In my mind it was/is justified as I had already seen other people struggle with it, and also my expectations of Swift are high, which means off-hand I wouldn't expect MutableSpan to come with the aforementioned penalty.

I can see most code is much more tolerant of such checks than swift-endpoint or a crypto package, but then I also expect MutableSpan to be the type for performance-sensitive code and to have no downsides compared to just using UnsafePointers.

Edit: title is now "`MutableSpan` comes with performance penalties compared to `UnsafePointer`s".
Previously (IIRC): "`MutableSpan` comes with relatively big performance penalties"

filip-sakel · June 16, 2026, 10:21am

I agree that’s a good direction for (Mutable)Span. Specifically, is there a particular code snippet where you’d expect the compiler to use a type’s “semantics” to eliminate a precondition check?

MahdiBM · June 16, 2026, 1:56pm

What I meant by "semantics", is that in a type like Array, I think Array already guarantees the buffer is aligned, so for Array the precondition is not necessary and stdlib should be able to just use an "unchecked" initializer of MutableSpan.

More generally, it appears all places in stdlib where there is a MutableSpan initialization (outside MutableSpan's own implementation), such as in most implementations of public var mutableSpan, the stdlib code goes through those preconditions that could be skipped (some might be optimized out already, not sure).

Search for `public var mutableSpan`

rg -lF 'public var mutableSpan' ./stdlib | xargs basename | nl
     1  UnsafeBufferPointer.swift.gyb
     2  ContiguousArray.swift
     3  ArraySlice.swift
     4  Array.swift
     5  UniqueBox.swift
     6  InlineArray.swift
     7  OutputSpan.swift
     8  CollectionOfOne.swift
     9  UniqueArray.swift

All aside from UnsafeBufferPointer should be able to skip the alignment checks I'd assume.

For my specific case I expect the compiler to be able to optimize-out that precondition for 1-byte types like UInt8/Int8/Bools as there can't be any misalignment issues for them.

UInt8 itself is the most important one. Int8 might also help when working with CChars, or maybe someone just prefers to use Int8 instead of UInt8, but I personally haven't bumped into that.
I could propose a few ways to possibly/maybe achieve this, for example using the @_specialize attribute, but then I know you'd already know better than me which way to go and what reasonable ways there are to skip that precondition.

Outside my specific case, I expect the compiler to be smart enough to totally optimize-out that specific precondition anyway. But I'm not a compiler engineer so I'm not sure what is blocking such an optimization and what could be done for that (hopefully without making it take another year to propagate changes through LLVM or such. Again, not a compiler engineer, these are just my vague understandings).

RandomHashTags · June 16, 2026, 9:34pm

I appreciate the title change, however:

"very high performance" rebuttal

Regardless of how the assembly looks, the real-world performance is the only thing that objectively matters. My improvement uses 180 less instructions (~5%) to do the same work (at least on my machine), which improves throughput by 2 samples using the project's throughput benchmark (30 million more operations in the same time-frame on my machine). The same methodology can apply to other parts of the project, but as I previously stated, I won't rewrite it.

When I say very high performance, I am talking about reducing total instructions executed and improving runtime performance for everything.

Unfortunately, this is another case of "skill issue".

MahdiBM · June 17, 2026, 3:51am

Response to ""very high performance" rebuttal"

My improvement uses 180 less instructions (~5%)

180? I haven't checked but that sounds suspiciously too high considering godbolt shows essentially the same number of assembly lines.

at least on my machine

Well, that's the keyword. You need to prove the significance of your machine. Does it use a CPU with similar properties compared to a typical server? Or ...? I know swift-endpoint's CI benchmarking machine does.

which improves throughput by 2 samples using the project's throughput benchmark (30 million more operations in the same time-frame on my machine).

I had already tried your changes, yesterday. It didn't make a visible difference in CI: Try Swift Forums suggestion by MahdiBM · Pull Request #21 · swift-dns/swift-endpoint · GitHub

Some months ago, I tried fully inlining that loop altogether. It didn't make a visible difference either, so that's why the loop is still there. Not because I didn't think it's faster, but because I didn't want such weird-looking code for such low perf gains (to be clear, weird-looking does not apply to your code). So the thing is, it needs to be visible in the benchmarks. Otherwise the change isn't sustainable and the minor improvements can be accidentally reverted in the future.

I'm assuming the reason for "less instructions" in your code is likely because of the byteSwapped, which means you don't need to reverse and index each time. If the inlined-loop was there, the byteSwapped would have only been some additional instructions and wouldn't have helped.

Also, so we're clear, swift-endpoint has a custom UInt128 implementation (backed by 2xUInt64 to be able to dodge Swift's UInt128 macOS 15 requirements for IPv6's storage), so you can't be thinking that I didn't know of the existence of byteSwapped or such.

I'll give it to you that your idea was good. But I'm still not convinced that it's worth a change.
Also, you already had pretty decent code to start with, if I may say so myself. You didn't have to work it out from the beginning, which could have possibly resulted in you going a totally different/worse way. So even if you're proud of your code there, I'm inclined to take part of the credit :)

After all, there is also a compiler in between the code that we write. At any point it could have taken a decision to inline the whole loop. Or maybe it does in a future version. Then the current code will be looking better since it skips one byteSwapped. You're also using bitshifts compared to simply loading the values from the memory which the current code does. Now the compiler is likely smart enough to know what's going on in here, but I doubt the bitshifts have anything to do with a performance improvement, considering what the current code does is to directly load from memory.

When I say very high performance, I am talking about reducing total instructions executed

less instructions executed does not mean better performance

The point here is that we can argue about this all-day long. I think there has been a misunderstanding about what actually I meant by "very high performance". In these kinds of situations, I always take at least part of the blame for not being clear enough. For the most part, it was referring to the fact that the implementations beat the counterpart C implementations.

Can they be improved? yes. But how far should I go? Do I go for inlining assembly in Swift files and call it "Swift"? Or how far do I go for looking into what the compiler likes to produce given some code? I have gone pretty deep into this, but still, there is a point that I need to stop.

Realistically too, I'm also under real-world restrictions. For example about how to benchmark the code. The CI machine already costs $18-20/month out of pocket. I can't simply just add all possible benchmark types to the CI, that'd take too long, and will also take more time to maintain. I had already considered adding instructions count to the CI since it's a fine unit of measurement. I can't exactly recall why that didn't happen, but probably it was to save some CI time. I remember at some point I was struggling with benchmark times and that's partially why swift-idna was moved to its dedicated repository.

MahdiBM · June 17, 2026, 5:25am

I see @Alejandro has already done the work in [stdlib] Collection span getters can use the unchecked initializer by Azoy · Pull Request #90002 · swiftlang/swift · GitHub. Very appreciated. I'm assuming it'll get cherry-picked into 6.4 as well.

I wonder what it takes to have public unchecked initializers too. Wouldn't want to be forced into not using MutableSpan in the future just because I can't express that I'm already sure the alignment is all-good. I think we'll need an "amendment" to the proposal (Including its implications, for example a new evolution forums thread)?
This is not without precedent, for example UTF8Span has init(unchecked: Span<UInt8>).
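To illustrate the shape of what I'm asking for, here is a minimal sketch of a checked/unchecked initializer pair. `AlignedBuffer` is a hypothetical wrapper invented for this example; it is not MutableSpan, has no lifetime annotations, and its initializer labels are assumptions rather than a proposed spelling.

```swift
/// Hypothetical wrapper, NOT the stdlib's MutableSpan. It only illustrates
/// pairing a validating initializer with an unchecked escape hatch.
struct AlignedBuffer<Element> {
    let base: UnsafeMutableRawPointer
    let count: Int

    /// Escape hatch: trusts the caller and performs no validation.
    init(unchecked base: UnsafeMutableRawPointer, count: Int) {
        self.base = base
        self.count = count
    }

    /// Validates alignment, then forwards to the unchecked initializer.
    init(checking base: UnsafeMutableRawPointer, count: Int) {
        precondition(
            (Int(bitPattern: base) & (MemoryLayout<Element>.alignment &- 1)) == 0,
            "base must be properly aligned to access Element"
        )
        self.init(unchecked: base, count: count)
    }
}

// Freshly allocated memory with a known alignment: the caller already
// knows the check would pass, so either initializer is valid here.
let memory = UnsafeMutableRawPointer.allocate(
    byteCount: MemoryLayout<UInt64>.size,
    alignment: MemoryLayout<UInt64>.alignment
)
let checked = AlignedBuffer<UInt64>(checking: memory, count: 1)
let unchecked = AlignedBuffer<UInt64>(unchecked: memory, count: 1)
let sameBase = checked.base == unchecked.base
memory.deallocate()
print(sameBase)
```

The point of the pair is that the checked path stays the default, while code that has just established the invariant itself can opt out explicitly.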

I wonder what folks here think about such a public unchecked initializer.

MahdiBM · June 17, 2026, 9:15pm

More response to ""very high performance" rebuttal"

I just did a check and while it makes sense now, I didn't know that a mem load is slower than a bitshift.
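For reference, a minimal sketch of the two ways of pulling an octet out of a 32-bit address that are being compared here. The function names are hypothetical, and the sketch assumes the first octet is stored in the most significant byte. It only shows that the two forms produce the same bytes; it says nothing about which one the optimizer ends up emitting.

```swift
let address: UInt32 = 0xC0A8_0102  // 192.168.1.2, first octet in the high byte

/// Shift-based extraction: octet 0 is the most significant byte.
func octetByShift(_ address: UInt32, _ index: Int) -> UInt8 {
    UInt8(truncatingIfNeeded: address >> (24 - 8 * index))
}

/// Load-based extraction: read one byte from the big-endian representation.
func octetByLoad(_ address: UInt32, _ index: Int) -> UInt8 {
    withUnsafeBytes(of: address.bigEndian) { $0[index] }
}

let shifted = (0..<4).map { octetByShift(address, $0) }
let loaded = (0..<4).map { octetByLoad(address, $0) }
print(shifted, loaded)  // prints "[192, 168, 1, 2] [192, 168, 1, 2]"
```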

Again, it's not that I hadn't thought of using bitshifts (I did not think of byteSwapped-ing though). The initial code was likely using bitshifts before I moved it to a mem load (it's likely somewhere in the git history as well), but there is the compiler in between, which is probably why the two versions don't differ much in performance anyway.

So I guess I should accept defeat in that sense, and accept the "skill issue"

MahdiBM · June 24, 2026, 1:46pm

I think I've finally figured out what is going on in here and how the compiler doesn't manage to optimize out such simple math.

github.com/swiftlang/swift

[SILOptimizer] Specialize closures with a mark_dependence on a capture (#90161)

main ← MahdiBM:mmbm-spclz-closr-w-mark-dep-on-captur

opened 01:39PM - 24 Jun 26 UTC

MahdiBM

+55 -8

Let's preface the PR with: I'm not a compiler engineer. This fix was made with help from Claude (max effort). While I've tested the effectiveness of the changes locally, I'm not sure how sound the changes actually are (I would think they are at least OK. If not actually good). You can feel free to close the PR and apply the fix yourself in a better way after reading the result of my investigations below.

Resolves https://github.com/swiftlang/swift/issues/89954.

### Proof of Issue

```bash
cat > issue-89954.swift <<'EOF'
@inline(never)
func asDecimal(_ x: UInt8, writeByte: (UInt8) -> Void) {
    let (q, r) = x.quotientAndRemainder(dividingBy: 10)
    writeByte(q &+ 48); writeByte(r &+ 48)
}

// A) Captures a MutableSpan (~Escapable): closure is wrapped in mark_dependence.
@inline(never)
public func viaMutableSpan(_ p: UnsafeMutableRawPointer, _ n: Int) -> Int {
    var span = unsafe MutableSpan(_unsafeStart: p.assumingMemoryBound(to: UInt8.self), count: n)
    var i = 0
    asDecimal(123) { span[i] = $0; i &+= 1 }
    asDecimal(45) { span[i] = $0; i &+= 1 }
    return i
}

// B) Same shape, Copyable pointer (no mark_dependence): specializes today.
@inline(never)
public func viaBuffer(_ p: UnsafeMutableRawPointer, _ n: Int) -> Int {
    let buf = unsafe UnsafeMutableBufferPointer(start: p.assumingMemoryBound(to: UInt8.self), count: n)
    var i = 0
    asDecimal(123) { buf[i] = $0; i &+= 1 }
    asDecimal(45) { buf[i] = $0; i &+= 1 }
    return i
}
EOF
swiftc -O -emit-sil issue-89954.swift | swift demangle
```

<details>
<summary>Click for the swift code but with syntax highlighting</summary>

```swift
@inline(never)
func asDecimal(_ x: UInt8, writeByte: (UInt8) -> Void) {
    let (q, r) = x.quotientAndRemainder(dividingBy: 10)
    writeByte(q &+ 48); writeByte(r &+ 48)
}

// A) Captures a MutableSpan (~Escapable): closure is wrapped in mark_dependence.
@inline(never)
public func viaMutableSpan(_ p: UnsafeMutableRawPointer, _ n: Int) -> Int {
    var span = unsafe MutableSpan(_unsafeStart: p.assumingMemoryBound(to: UInt8.self), count: n)
    var i = 0
    asDecimal(123) { span[i] = $0; i &+= 1 }
    asDecimal(45) { span[i] = $0; i &+= 1 }
    return i
}

// B) Same shape, Copyable pointer (no mark_dependence): specializes today.
@inline(never)
public func viaBuffer(_ p: UnsafeMutableRawPointer, _ n: Int) -> Int {
    let buf = unsafe UnsafeMutableBufferPointer(start: p.assumingMemoryBound(to: UInt8.self), count: n)
    var i = 0
    asDecimal(123) { buf[i] = $0; i &+= 1 }
    asDecimal(45) { buf[i] = $0; i &+= 1 }
    return i
}
```

</details>

<details>
<summary>Click for full result of the command on my machine</summary>

```bash
sil_stage canonical

import Builtin
import Swift
import SwiftShims

@inline(never) func asDecimal(_ x: UInt8, writeByte: (UInt8) -> ())

@inline(never) public func viaMutableSpan(_ p: UnsafeMutableRawPointer, _ n: Int) -> Int

@inline(never) public func viaBuffer(_ p: UnsafeMutableRawPointer, _ n: Int) -> Int

// main
// Isolation: unspecified
sil @main : $@convention(c) (Int32, UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>) -> Int32 {
[%1: noescape **]
[global: ]
bb0(%0 : $Int32, %1 : $UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>):
  %2 = integer_literal $Builtin.Int32, 0 // user: %3
  %3 = struct $Int32 (%2) // user: %4
  return %3 // id: %4
} // end sil function 'main'

// asDecimal(_:writeByte:)
// Isolation: unspecified
sil hidden [noinline] @main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, @guaranteed @noescape @callee_guaranteed (UInt8) -> ()) -> () {
[%1: noescape **, read v**.c*.v**, write v**.c*.v**, copy v**.c*.v**, destroy v**.c*.v**]
[global: read,write,copy,destroy,allocate,deinit_barrier]
// %0 "x" // users: %5, %2
// %1 "writeByte" // users: %19, %15, %3
bb0(%0 : $UInt8, %1 : $@noescape @callee_guaranteed (UInt8) -> ()):
  debug_value %0, let, name "x", argno 1 // id: %2
  debug_value %1, let, name "writeByte", argno 2 // id: %3
  %4 = integer_literal $Builtin.Int8, 10 // users: %7, %6
  %5 = struct_extract %0, #UInt8._value // users: %7, %6
  %6 = builtin "udiv_Int8"(%5, %4) : $Builtin.Int8 // users: %9, %12
  %7 = builtin "urem_Int8"(%5, %4) : $Builtin.Int8 // users: %8, %16
  debug_value %7, let, name "r", type $UInt8, expr op_fragment:#UInt8._value // id: %8
  debug_value %6, let, name "q", type $UInt8, expr op_fragment:#UInt8._value // id: %9
  %10 = integer_literal $Builtin.Int8, 48 // users: %16, %12
  %11 = integer_literal $Builtin.Int1, 0 // users: %16, %12
  %12 = builtin "uadd_with_overflow_Int8"(%6, %10, %11) : $(Builtin.Int8, Builtin.Int1) // user: %13
  %13 = tuple_extract %12, 0 // user: %14
  %14 = struct $UInt8 (%13) // user: %15
  %15 = apply %1(%14) : $@noescape @callee_guaranteed (UInt8) -> ()
  %16 = builtin "uadd_with_overflow_Int8"(%7, %10, %11) : $(Builtin.Int8, Builtin.Int1) // user: %17
  %17 = tuple_extract %16, 0 // user: %18
  %18 = struct $UInt8 (%17) // user: %19
  %19 = apply %1(%18) : $@noescape @callee_guaranteed (UInt8) -> ()
  %20 = tuple () // user: %21
  return %20 // id: %21
} // end sil function 'main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> ()'

// viaMutableSpan(_:_:)
// Isolation: unspecified
sil [noinline] @main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UnsafeMutableRawPointer, Int) -> Int {
[%0: noescape **, read v**.c*.v**, write v**.c*.v**, copy v**.c*.v**, destroy v**.c*.v**]
[global: read,write,copy,destroy,allocate,deinit_barrier]
// %0 "p" // users: %11, %2
// %1 "n" // users: %6, %3
bb0(%0 : $UnsafeMutableRawPointer, %1 : $Int):
  debug_value %0, let, name "p", argno 1 // id: %2
  debug_value %1, let, name "n", argno 2 // id: %3
  %4 = alloc_stack [lexical] [var_decl] $MutableSpan<UInt8>, var, name "span", type $MutableSpan<UInt8> // users: %13, %29, %28, %21, %20, %34
  %5 = integer_literal $Builtin.Int64, 0 // users: %15, %7
  %6 = struct_extract %1, #Int._value // users: %9, %7
  %7 = builtin "cmp_slt_Int64"(%6, %5) : $Builtin.Int1 // user: %8
  cond_fail %7, "Count must not be negative" // id: %8
  %9 = builtin "assumeNonNegative_Int64"(%6) : $Builtin.Int64 // user: %10
  %10 = struct $Int (%9) // user: %12
  %11 = enum $Optional<UnsafeMutableRawPointer>, #Optional.some!enumelt, %0 // user: %12
  %12 = struct $MutableSpan<UInt8> (%11, %10) // user: %13
  store %12 to %4 // id: %13
  %14 = alloc_stack [var_decl] $Int, var, name "i", type $Int // users: %32, %16, %28, %20, %33
  %15 = struct $Int (%5) // user: %16
  store %15 to %14 // id: %16
  %17 = integer_literal $Builtin.Int8, 123 // user: %18
  %18 = struct $UInt8 (%17) // user: %23
  // function_ref closure #1 in viaMutableSpan(_:_:)
  %19 = function_ref @closure #1 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // user: %20
  %20 = partial_apply [callee_guaranteed] [on_stack] %19(%4, %14) : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // users: %24, %21
  %21 = mark_dependence [nonescaping] %20 on %4 // user: %23
  // function_ref asDecimal(_:writeByte:)
  %22 = function_ref @main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, @guaranteed @noescape @callee_guaranteed (UInt8) -> ()) -> () // users: %30, %23
  %23 = apply %22(%18, %21) : $@convention(thin) (UInt8, @guaranteed @noescape @callee_guaranteed (UInt8) -> ()) -> ()
  dealloc_stack %20 // id: %24
  %25 = integer_literal $Builtin.Int8, 45 // user: %26
  %26 = struct $UInt8 (%25) // user: %30
  // function_ref closure #2 in viaMutableSpan(_:_:)
  %27 = function_ref @closure #2 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // user: %28
  %28 = partial_apply [callee_guaranteed] [on_stack] %27(%4, %14) : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // users: %31, %29
  %29 = mark_dependence [nonescaping] %28 on %4 // user: %30
  %30 = apply %22(%26, %29) : $@convention(thin) (UInt8, @guaranteed @noescape @callee_guaranteed (UInt8) -> ()) -> ()
  dealloc_stack %28 // id: %31
  %32 = load %14 // user: %35
  dealloc_stack %14 // id: %33
  dealloc_stack %4 // id: %34
  return %32 // id: %35
} // end sil function 'main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int'

// closure #1 in viaMutableSpan(_:_:)
// Isolation: nonisolated
sil private @closure #1 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () {
[%1: noescape **, read v**]
[%2: read s0.v**, write v**]
[global: read,write,deinit_barrier]
// %0 "$0" // users: %25, %3
// %1 "span" // users: %8, %17, %24, %4
// %2 "i" // users: %32, %6, %5
bb0(%0 : $UInt8, %1 : @closureCapture $*MutableSpan<UInt8>, %2 : @closureCapture $*Int):
  debug_value %0, let, name "$0", argno 1 // id: %3
  debug_value %1, var, name "span", argno 2, expr op_deref // id: %4
  debug_value %2, var, name "i", argno 3, expr op_deref // id: %5
  %6 = struct_element_addr %2, #Int._value // users: %27, %7
  %7 = load %6 // users: %14, %13, %22
  %8 = struct_element_addr %1, #MutableSpan._count // user: %9
  %9 = struct_element_addr %8, #Int._value // user: %10
  %10 = load %9 // user: %12
  %11 = integer_literal $Builtin.Int64, 0 // user: %13
  %12 = builtin "assumeNonNegative_Int64"(%10) : $Builtin.Int64 // user: %14
  %13 = builtin "cmp_slt_Int64"(%7, %11) : $Builtin.Int1 // user: %15
  %14 = builtin "cmp_sge_Int64"(%7, %12) : $Builtin.Int1 // user: %15
  %15 = builtin "or_Int1"(%13, %14) : $Builtin.Int1 // user: %16
  cond_fail %15, "index out of bounds" // id: %16
  %17 = struct_element_addr %1, #MutableSpan._pointer // user: %18
  %18 = load %17 // user: %19
  %19 = unchecked_enum_data %18, #Optional.some!enumelt // user: %20
  %20 = struct_extract %19, #UnsafeMutableRawPointer._rawValue // user: %21
  %21 = pointer_to_address %20 to [strict] $*UInt8 // user: %23
  %22 = builtin "truncOrBitCast_Int64_Word"(%7) : $Builtin.Word // user: %23
  %23 = index_addr %21, %22 // user: %24
  %24 = mark_dependence [nonescaping] %23 on %1 // user: %25
  store %0 to %24 // id: %25
  %26 = integer_literal $Builtin.Int64, 1 // user: %29
  %27 = load %6 // user: %29
  %28 = integer_literal $Builtin.Int1, 0 // user: %29
  %29 = builtin "sadd_with_overflow_Int64"(%27, %26, %28) : $(Builtin.Int64, Builtin.Int1) // user: %30
  %30 = tuple_extract %29, 0 // user: %31
  %31 = struct $Int (%30) // user: %32
  store %31 to %2 // id: %32
  %33 = tuple () // user: %34
  return %33 // id: %34
} // end sil function 'closure #1 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int'

// closure #2 in viaMutableSpan(_:_:)
// Isolation: nonisolated
sil private @closure #2 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () {
[%1: noescape **, read v**]
[%2: read s0.v**, write v**]
[global: read,write,deinit_barrier]
// %0 "$0" // users: %25, %3
// %1 "span" // users: %8, %17, %24, %4
// %2 "i" // users: %32, %6, %5
bb0(%0 : $UInt8, %1 : @closureCapture $*MutableSpan<UInt8>, %2 : @closureCapture $*Int):
  debug_value %0, let, name "$0", argno 1 // id: %3
  debug_value %1, var, name "span", argno 2, expr op_deref // id: %4
  debug_value %2, var, name "i", argno 3, expr op_deref // id: %5
  %6 = struct_element_addr %2, #Int._value // users: %27, %7
  %7 = load %6 // users: %14, %13, %22
  %8 = struct_element_addr %1, #MutableSpan._count // user: %9
  %9 = struct_element_addr %8, #Int._value // user: %10
  %10 = load %9 // user: %12
  %11 = integer_literal $Builtin.Int64, 0 // user: %13
  %12 = builtin "assumeNonNegative_Int64"(%10) : $Builtin.Int64 // user: %14
  %13 = builtin "cmp_slt_Int64"(%7, %11) : $Builtin.Int1 // user: %15
  %14 = builtin "cmp_sge_Int64"(%7, %12) : $Builtin.Int1 // user: %15
  %15 = builtin "or_Int1"(%13, %14) : $Builtin.Int1 // user: %16
  cond_fail %15, "index out of bounds" // id: %16
  %17 = struct_element_addr %1, #MutableSpan._pointer // user: %18
  %18 = load %17 // user: %19
  %19 = unchecked_enum_data %18, #Optional.some!enumelt // user: %20
  %20 = struct_extract %19, #UnsafeMutableRawPointer._rawValue // user: %21
  %21 = pointer_to_address %20 to [strict] $*UInt8 // user: %23
  %22 = builtin "truncOrBitCast_Int64_Word"(%7) : $Builtin.Word // user: %23
  %23 = index_addr %21, %22 // user: %24
  %24 = mark_dependence [nonescaping] %23 on %1 // user: %25
  store %0 to %24 // id: %25
  %26 = integer_literal $Builtin.Int64, 1 // user: %29
  %27 = load %6 // user: %29
  %28 = integer_literal $Builtin.Int1, 0 // user: %29
  %29 = builtin "sadd_with_overflow_Int64"(%27, %26, %28) : $(Builtin.Int64, Builtin.Int1) // user: %30
  %30 = tuple_extract %29, 0 // user: %31
  %31 = struct $Int (%30) // user: %32
  store %31 to %2 // id: %32
  %33 = tuple () // user: %34
  return %33 // id: %34
} // end sil function 'closure #2 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int'

// viaBuffer(_:_:)
// Isolation: unspecified
sil [noinline] @main.viaBuffer(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UnsafeMutableRawPointer, Int) -> Int {
[%0: noescape **, write s0.v**.c*.v**]
[global: write,deinit_barrier]
// %0 "p" // users: %4, %2
// %1 "n" // user: %3
bb0(%0 : $UnsafeMutableRawPointer, %1 : $Int):
  debug_value %0, let, name "p", argno 1 // id: %2
  debug_value %1, let, name "n", argno 2 // id: %3
  %4 = struct_extract %0, #UnsafeMutableRawPointer._rawValue // user: %5
  %5 = struct $UnsafeMutablePointer<UInt8> (%4) // user: %6
  %6 = enum $Optional<UnsafeMutablePointer<UInt8>>, #Optional.some!enumelt, %5 // users: %7, %19, %15
  debug_value %6, let, name "buf", type $UnsafeMutableBufferPointer<UInt8>, expr op_fragment:#UnsafeMutableBufferPointer._position // id: %7
  %8 = alloc_stack [var_decl] $Int, var, name "i", type $Int // users: %15, %19, %20, %11, %21
  %9 = integer_literal $Builtin.Int64, 0 // user: %10
  %10 = struct $Int (%9) // user: %11
  store %10 to %8 // id: %11
  %12 = integer_literal $Builtin.Int8, 123 // user: %13
  %13 = struct $UInt8 (%12) // user: %15
  // function_ref specialized asDecimal(_:writeByte:)
  %14 = function_ref @function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> () // user: %15
  %15 = apply %14(%13, %6, %8) : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> ()
  %16 = integer_literal $Builtin.Int8, 45 // user: %17
  %17 = struct $UInt8 (%16) // user: %19
  // function_ref specialized asDecimal(_:writeByte:)
  %18 = function_ref @function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU0_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> () // user: %19
  %19 = apply %18(%17, %6, %8) : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> ()
  %20 = load %8 // user: %22
  dealloc_stack %8 // id: %21
  return %20 // id: %22
} // end sil function 'main.viaBuffer(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int'

// specialized asDecimal(_:writeByte:)
// Isolation: unspecified
sil shared [noinline] @function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> () {
[%1: noescape **, write v**.c*.v**]
[%2: read s0.v**, write v**]
[global: write,deinit_barrier]
// %0 // users: %5, %3
// %1 // user: %19
// %2 // users: %30, %17, %16, %43, %35
bb0(%0 : $UInt8, %1 : $Optional<UnsafeMutablePointer<UInt8>>, %2 : $*Int):
  debug_value %0, let, name "x", argno 1 // id: %3
  %4 = integer_literal $Builtin.Int8, 10 // users: %7, %6
  %5 = struct_extract %0, #UInt8._value // users: %7, %6
  %6 = builtin "udiv_Int8"(%5, %4) : $Builtin.Int8 // users: %12, %9
  %7 = builtin "urem_Int8"(%5, %4) : $Builtin.Int8 // users: %31, %8
  debug_value %7, let, name "r", type $UInt8, expr op_fragment:#UInt8._value // id: %8
  debug_value %6, let, name "q", type $UInt8, expr op_fragment:#UInt8._value // id: %9
  %10 = integer_literal $Builtin.Int8, 48 // users: %31, %12
  %11 = integer_literal $Builtin.Int1, 0 // users: %40, %27, %31, %12
  %12 = builtin "uadd_with_overflow_Int8"(%6, %10, %11) : $(Builtin.Int8, Builtin.Int1) // user: %13
  %13 = tuple_extract %12, 0 // user: %14
  %14 = struct $UInt8 (%13) // users: %24, %15
  debug_value %14, let, name "$0", argno 1 // id: %15
  debug_value %2, var, name "i", argno 3, expr op_deref // id: %16
  %17 = struct_element_addr %2, #Int._value // users: %39, %26, %18
  %18 = load %17 // user: %21
  %19 = unchecked_enum_data %1, #Optional.some!enumelt // user: %20
  %20 = struct_extract %19, #UnsafeMutablePointer._rawValue // user: %22
  %21 = builtin "truncOrBitCast_Int64_Word"(%18) : $Builtin.Word // user: %23
  %22 = pointer_to_address %20 to [strict] $*UInt8 // users: %37,
%23 %23 = index_addr [stack_protection] %22, %21 // user: %24 store %14 to %23 // id: %24 %25 = integer_literal $Builtin.Int64, 1 // users: %40, %27 %26 = load %17 // user: %27 %27 = builtin "sadd_with_overflow_Int64"(%26, %25, %11) : $(Builtin.Int64, Builtin.Int1) // user: %28 %28 = tuple_extract %27, 0 // users: %36, %29 %29 = struct $Int (%28) // user: %30 store %29 to %2 // id: %30 %31 = builtin "uadd_with_overflow_Int8"(%7, %10, %11) : $(Builtin.Int8, Builtin.Int1) // user: %32 %32 = tuple_extract %31, 0 // user: %33 %33 = struct $UInt8 (%32) // users: %38, %34 debug_value %33, let, name "$0", argno 1 // id: %34 debug_value %2, var, name "i", argno 3, expr op_deref // id: %35 %36 = builtin "truncOrBitCast_Int64_Word"(%28) : $Builtin.Word // user: %37 %37 = index_addr [stack_protection] %22, %36 // user: %38 store %33 to %37 // id: %38 %39 = load %17 // user: %40 %40 = builtin "sadd_with_overflow_Int64"(%39, %25, %11) : $(Builtin.Int64, Builtin.Int1) // user: %41 %41 = tuple_extract %40, 0 // user: %42 %42 = struct $Int (%41) // user: %43 store %42 to %2 // id: %43 %44 = tuple () // user: %45 return %44 // id: %45 } // end sil function 'function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> ()' // specialized asDecimal(_:writeByte:) // Isolation: unspecified sil shared [noinline] @function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU0_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> () { [%1: noescape **, write 
v**.c*.v**] [%2: read s0.v**, write v**] [global: write,deinit_barrier] // %0 // users: %5, %3 // %1 // user: %19 // %2 // users: %30, %17, %16, %43, %35 bb0(%0 : $UInt8, %1 : $Optional<UnsafeMutablePointer<UInt8>>, %2 : $*Int): debug_value %0, let, name "x", argno 1 // id: %3 %4 = integer_literal $Builtin.Int8, 10 // users: %7, %6 %5 = struct_extract %0, #UInt8._value // users: %7, %6 %6 = builtin "udiv_Int8"(%5, %4) : $Builtin.Int8 // users: %12, %9 %7 = builtin "urem_Int8"(%5, %4) : $Builtin.Int8 // users: %31, %8 debug_value %7, let, name "r", type $UInt8, expr op_fragment:#UInt8._value // id: %8 debug_value %6, let, name "q", type $UInt8, expr op_fragment:#UInt8._value // id: %9 %10 = integer_literal $Builtin.Int8, 48 // users: %31, %12 %11 = integer_literal $Builtin.Int1, 0 // users: %40, %27, %31, %12 %12 = builtin "uadd_with_overflow_Int8"(%6, %10, %11) : $(Builtin.Int8, Builtin.Int1) // user: %13 %13 = tuple_extract %12, 0 // user: %14 %14 = struct $UInt8 (%13) // users: %24, %15 debug_value %14, let, name "$0", argno 1 // id: %15 debug_value %2, var, name "i", argno 3, expr op_deref // id: %16 %17 = struct_element_addr %2, #Int._value // users: %39, %26, %18 %18 = load %17 // user: %21 %19 = unchecked_enum_data %1, #Optional.some!enumelt // user: %20 %20 = struct_extract %19, #UnsafeMutablePointer._rawValue // user: %22 %21 = builtin "truncOrBitCast_Int64_Word"(%18) : $Builtin.Word // user: %23 %22 = pointer_to_address %20 to [strict] $*UInt8 // users: %37, %23 %23 = index_addr [stack_protection] %22, %21 // user: %24 store %14 to %23 // id: %24 %25 = integer_literal $Builtin.Int64, 1 // users: %40, %27 %26 = load %17 // user: %27 %27 = builtin "sadd_with_overflow_Int64"(%26, %25, %11) : $(Builtin.Int64, Builtin.Int1) // user: %28 %28 = tuple_extract %27, 0 // users: %36, %29 %29 = struct $Int (%28) // user: %30 store %29 to %2 // id: %30 %31 = builtin "uadd_with_overflow_Int8"(%7, %10, %11) : $(Builtin.Int8, Builtin.Int1) // user: %32 %32 = tuple_extract 
%31, 0 // user: %33 %33 = struct $UInt8 (%32) // users: %38, %34 debug_value %33, let, name "$0", argno 1 // id: %34 debug_value %2, var, name "i", argno 3, expr op_deref // id: %35 %36 = builtin "truncOrBitCast_Int64_Word"(%28) : $Builtin.Word // user: %37 %37 = index_addr [stack_protection] %22, %36 // user: %38 store %33 to %37 // id: %38 %39 = load %17 // user: %40 %40 = builtin "sadd_with_overflow_Int64"(%39, %25, %11) : $(Builtin.Int64, Builtin.Int1) // user: %41 %41 = tuple_extract %40, 0 // user: %42 %42 = struct $Int (%41) // user: %43 store %42 to %2 // id: %43 %44 = tuple () // user: %45 return %44 // id: %45 } // end sil function 'function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU0_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> ()' // Mappings from '#fileID' to '#filePath': // 'main/issue-89954.swift' => 'issue-89954.swift'
```

</details>

You can see that `viaMutableSpan`'s SIL is as follows:

```bash
// viaMutableSpan(_:_:)
// Isolation: unspecified
sil [noinline] @main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UnsafeMutableRawPointer, Int) -> Int {
[%0: noescape **, read v**.c*.v**, write v**.c*.v**, copy v**.c*.v**, destroy v**.c*.v**]
[global: read,write,copy,destroy,allocate,deinit_barrier]
// %0 "p" // users: %11, %2
// %1 "n" // users: %6, %3
bb0(%0 : $UnsafeMutableRawPointer, %1 : $Int):
  debug_value %0, let, name "p", argno 1 // id: %2
  debug_value %1, let, name "n", argno 2 // id: %3
  %4 = alloc_stack [lexical] [var_decl] $MutableSpan<UInt8>, var, name "span", type $MutableSpan<UInt8> // users: %13, %29, %28, %21, %20, %34
  %5 = integer_literal $Builtin.Int64, 0 // users: %15, %7
  %6 = struct_extract %1, #Int._value // users: %9, %7
  %7 = builtin "cmp_slt_Int64"(%6, %5) : $Builtin.Int1 // user: %8
  cond_fail %7, "Count must not be negative" // id: %8
  %9 = builtin "assumeNonNegative_Int64"(%6) : $Builtin.Int64 // user: %10
  %10 = struct $Int (%9) // user: %12
  %11 = enum $Optional<UnsafeMutableRawPointer>, #Optional.some!enumelt, %0 // user: %12
  %12 = struct $MutableSpan<UInt8> (%11, %10) // user: %13
  store %12 to %4 // id: %13
  %14 = alloc_stack [var_decl] $Int, var, name "i", type $Int // users: %32, %16, %28, %20, %33
  %15 = struct $Int (%5) // user: %16
  store %15 to %14 // id: %16
  %17 = integer_literal $Builtin.Int8, 123 // user: %18
  %18 = struct $UInt8 (%17) // user: %23
  // function_ref closure #1 in viaMutableSpan(_:_:)
  %19 = function_ref @closure #1 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // user: %20
  %20 = partial_apply [callee_guaranteed] [on_stack] %19(%4, %14) : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // users: %24, %21
  %21 = mark_dependence [nonescaping] %20 on %4 // user: %23
  // function_ref asDecimal(_:writeByte:)
  %22 = function_ref @main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, @guaranteed @noescape @callee_guaranteed (UInt8) -> ()) -> () // users: %30, %23
  %23 = apply %22(%18, %21) : $@convention(thin) (UInt8, @guaranteed @noescape @callee_guaranteed (UInt8) -> ()) -> ()
  dealloc_stack %20 // id: %24
  %25 = integer_literal $Builtin.Int8, 45 // user: %26
  %26 = struct $UInt8 (%25) // user: %30
  // function_ref closure #2 in viaMutableSpan(_:_:)
  %27 = function_ref @closure #2 (Swift.UInt8) -> () in main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // user: %28
  %28 = partial_apply [callee_guaranteed] [on_stack] %27(%4, %14) : $@convention(thin) (UInt8, @inout_aliasable MutableSpan<UInt8>, @inout_aliasable Int) -> () // users: %31, %29
  %29 = mark_dependence [nonescaping] %28 on %4 // user: %30
  %30 = apply %22(%26, %29) : $@convention(thin) (UInt8, @guaranteed @noescape @callee_guaranteed (UInt8) -> ()) -> ()
  dealloc_stack %28 // id: %31
  %32 = load %14 // user: %35
  dealloc_stack %14 // id: %33
  dealloc_stack %4 // id: %34
  return %32 // id: %35
} // end sil function 'main.viaMutableSpan(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int'
```

And for `viaBuffer`:

```bash
// viaBuffer(_:_:)
// Isolation: unspecified
sil [noinline] @main.viaBuffer(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int : $@convention(thin) (UnsafeMutableRawPointer, Int) -> Int {
[%0: noescape **, write s0.v**.c*.v**]
[global: write,deinit_barrier]
// %0 "p" // users: %4, %2
// %1 "n" // user: %3
bb0(%0 : $UnsafeMutableRawPointer, %1 : $Int):
  debug_value %0, let, name "p", argno 1 // id: %2
  debug_value %1, let, name "n", argno 2 // id: %3
  %4 = struct_extract %0, #UnsafeMutableRawPointer._rawValue // user: %5
  %5 = struct $UnsafeMutablePointer<UInt8> (%4) // user: %6
  %6 = enum $Optional<UnsafeMutablePointer<UInt8>>, #Optional.some!enumelt, %5 // users: %7, %19, %15
  debug_value %6, let, name "buf", type $UnsafeMutableBufferPointer<UInt8>, expr op_fragment:#UnsafeMutableBufferPointer._position // id: %7
  %8 = alloc_stack [var_decl] $Int, var, name "i", type $Int // users: %15, %19, %20, %11, %21
  %9 = integer_literal $Builtin.Int64, 0 // user: %10
  %10 = struct $Int (%9) // user: %11
  store %10 to %8 // id: %11
  %12 = integer_literal $Builtin.Int8, 123 // user: %13
  %13 = struct $UInt8 (%12) // user: %15
  // function_ref specialized asDecimal(_:writeByte:)
  %14 = function_ref @function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> () // user: %15
  %15 = apply %14(%13, %6, %8) : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> ()
  %16 = integer_literal $Builtin.Int8, 45 // user: %17
  %17 = struct $UInt8 (%16) // user: %19
  // function_ref specialized asDecimal(_:writeByte:)
  %18 = function_ref @function signature specialization <Arg[1] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : $s4main9viaBufferySiSv_SitFys5UInt8VXEfU0_, Argument Types : [Swift.UnsafeMutableBufferPointer<Swift.UInt8>Swift.Int]> of main.asDecimal(_: Swift.UInt8, writeByte: (Swift.UInt8) -> ()) -> () : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> () // user: %19
  %19 = apply %18(%17, %6, %8) : $@convention(thin) (UInt8, Optional<UnsafeMutablePointer<UInt8>>, @inout_aliasable Int) -> ()
  %20 = load %8 // user: %22
  dealloc_stack %8 // id: %21
  return %20 // id: %22
} // end sil function 'main.viaBuffer(Swift.UnsafeMutableRawPointer, Swift.Int) -> Swift.Int'
```

One difference is that in `viaMutableSpan` we have:

```bash
function_ref closure #1 in viaMutableSpan(_:_:)
```

But in `viaBuffer`:

```bash
function_ref specialized asDecimal(_:writeByte:)
```

So clearly, in `viaMutableSpan`, the `asDecimal(_:writeByte:)` function fails to get specialized/inlined. This in turn blocks further compiler optimizations, which results in considerably lower performance for `MutableSpan`.

In Claude's words:

> Both functions are the same shape; the only difference is the captured type. MutableSpan (~Escapable) ⇒ mark_dependence ⇒ optimizer gives up ⇒ generic call with an indirect closure call. UnsafeMutableBufferPointer (Copyable) ⇒ no mark_dependence ⇒ optimizer specializes ⇒ direct call. That contrast is the bug.

### The solution

I'm not a compiler engineer, so the changes were made by Claude. I pushed it hard to make sure the changes are proper, but I can't be sure they are (apart from the fact that they do work, since I asked Claude to compile and verify them, even though compiling takes a while):

```md
### Problem

findSpecializableClosure's MarkDependenceInst case bailed unless visited.contains(mdi.base). A non-escaping closure capturing a ~Escapable value has a mark_dependence whose base is a capture (not in the chain), so it never specialized, leaving a generic callee with a per-call indirect apply.

### Fix

Accept when the base is a captured operand of the root closure (computed lazily, so it's only evaluated when the visited check fails):

    var baseIsRootClosureCapture: Bool {
      (operandClosure as? PartialApplyInst)?.arguments.contains { $0 == mdi.base } ?? false
    }

Existing gates still run via the recursion into mdi.value.

Resolve the base into the specialization. uniqueCaptureArguments rewrites the partial_apply operands to identity casts but not the mark_dependence base, so mdi.base == cast.fromValue. visitMarkDependenceInst resolves the base through the cloner map, so fold the pre-cast value too:

    if let cast = originalClosureArg as? UncheckedValueCastInst,
       !cloner.isCloned(value: cast.fromValue) {
      cloner.recordFoldedValue(cast.fromValue, mappedTo: capturedArg)
    }

Guard: the cloner map is value-keyed, so bail if a base is captured >1× (otherwise a reused specialization mis-maps). hasMultiplyCapturedDependenceBase walks the closure-argument chain (mirroring findSpecializableClosure, including reabstraction thunks) and short-circuits via isCapturedMoreThanOnce, with no value accumulation.

### Soundness

Gated on isNoEscapeFunction ⇒ captures are forwarded @guaranteed and outlive the apply; the mark_dependence is re-materialized on the new argument. Every accepted base is either cloned (in chain) or a uniquely-captured value folded to its arg.

### Tests

closure_specialization_attrs.sil updated (on-stack [readonly] now specializes, @guaranteed, no retain). Full test/SILOptimizer green (1167 passed, 0 failures); standard library rebuilds under -enable-sil-verify-all with no optimizer crash; the reproducer specializes (no residual partial_apply/mark_dependence).
```

### Environment

```bash
$ swiftc --version
swift-driver version: 1.167 Apple Swift version 6.4 (swiftlang-6.4.0.20.104 clang-2100.3.20.102)
Target: arm64-apple-macosx27.0.0
```

Essentially, it's a bug in the compiler that affects non-escapable types: the optimizer can skip specializing/inlining functions that take a closure capturing such a value.
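For anyone who wants the shape of the code without reading the SIL, here is a rough sketch of the `UnsafeMutableBufferPointer` side only. The function names `asDecimal(_:writeByte:)` and `viaBuffer(_:_:)` come from the SIL dump; the bodies and the inputs `45`/`67` are my own guess at something equivalent, not the actual reproducer. The `MutableSpan` variant would have the same closures capturing a `MutableSpan<UInt8>` instead of `buf`, and that capture is what introduces the `mark_dependence` discussed above.

```swift
// Hypothetical sketch: names follow the SIL dump, bodies and inputs are assumed.

@inline(never)
func asDecimal(_ x: UInt8, writeByte: (UInt8) -> Void) {
    // Emits the tens and ones digits as ASCII; only meaningful for x < 100.
    let (q, r) = x.quotientAndRemainder(dividingBy: 10)
    writeByte(q &+ 48) // 48 is ASCII "0"
    writeByte(r &+ 48)
}

@inline(never)
func viaBuffer(_ p: UnsafeMutableRawPointer, _ n: Int) -> Int {
    let buf = UnsafeMutableBufferPointer(
        start: p.bindMemory(to: UInt8.self, capacity: n),
        count: n
    )
    var i = 0
    // Each non-escaping closure captures a plain buffer pointer plus `i`,
    // which is the case the closure specializer already handles.
    asDecimal(45) { buf[i] = $0; i += 1 }
    asDecimal(67) { buf[i] = $0; i += 1 }
    return i
}

let storage = UnsafeMutableRawPointer.allocate(byteCount: 4, alignment: 1)
let written = viaBuffer(storage, 4)
let bytes = UnsafeRawBufferPointer(start: storage, count: written)
let text = String(decoding: bytes, as: UTF8.self)
storage.deallocate()
print(written, text)
```

Compiling something like this with `-O -emit-sil` is one way to check whether `asDecimal(_:writeByte:)` shows up as a "specialized" function in the output, as it does for `viaBuffer` in the dump above.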