Compiler Crash with InlineArray

I was experimenting with converting an internal work buffer ([UInt8] of length 8) to an InlineArray, but I'm getting a compiler crash in Xcode 26.0.1 (17A400) with the following simplified example:

struct Column {
	var row: InlineArray<4, UInt8> = [0, 0, 0, 0]
	
	var asData: Data {
		var data = Data()
		
		data.append(row, count: row.count)
		
		return data
	}
}

Crash:

1.	Apple Swift version 6.2 (swiftlang-6.2.0.19.9 clang-1700.3.19.1)
2.	Compiling with the current language version
3.	While evaluating request ASTLoweringRequest(Lowering AST to SIL for file "/Volumes/Lara/Instant Interactive/Archived/InlineArray/IIArchiver.swift")
4.	While silgen emitFunction SIL function "@$s11InlineArray6ColumnV6asData10Foundation0E0Vvg".
 for getter for asData (at /Volumes/Lara/Instant Interactive/Archived/InlineArray/IIArchiver.swift:14:6)
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  swift-frontend           0x0000000108b25bcc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  swift-frontend           0x0000000108b2355c llvm::sys::RunSignalHandlers() + 112
2  swift-frontend           0x0000000108b261f8 SignalHandler(int, __siginfo*, void*) + 344
3  libsystem_platform.dylib 0x000000018e19a744 _sigtramp + 56
4  libsystem_pthread.dylib  0x000000018e190888 pthread_kill + 296
5  libsystem_c.dylib        0x000000018e096808 abort + 124
6  swift-frontend           0x0000000102b3f660 swift::DiagnosticHelper::~DiagnosticHelper() + 0
7  swift-frontend           0x0000000108a93ec4 llvm::report_fatal_error(llvm::Twine const&, bool) + 280
8  swift-frontend           0x0000000108a93dac llvm::report_fatal_error(llvm::Twine const&, bool) + 0
9  swift-frontend           0x0000000103320e5c swift::Lowering::SILGenModule::useConformance(swift::ProtocolConformanceRef) + 236
10 swift-frontend           0x0000000103322778 LazyConformanceEmitter::visitPartialApplyInst(swift::PartialApplyInst*) + 316
11 swift-frontend           0x0000000103252cf0 swift::Lowering::SILGenModule::postEmitFunction(swift::SILDeclRef, swift::SILFunction*) + 96
12 swift-frontend           0x00000001032526d4 swift::Lowering::SILGenModule::emitFunctionDefinition(swift::SILDeclRef, swift::SILFunction*) + 7636
13 swift-frontend           0x00000001032536f0 swift::Lowering::SILGenModule::emitOrDelayFunction(swift::SILDeclRef) + 236
14 swift-frontend           0x00000001032508ec swift::Lowering::SILGenModule::emitFunction(swift::FuncDecl*) + 136
15 swift-frontend           0x00000001033ba668 (anonymous namespace)::SILGenType::visitFuncDecl(swift::FuncDecl*) + 32
16 swift-frontend           0x00000001033ba868 (anonymous namespace)::SILGenType::visitAbstractStorageDecl(swift::AbstractStorageDecl*) + 248
17 swift-frontend           0x00000001033ba5d8 (anonymous namespace)::SILGenType::visitVarDecl(swift::VarDecl*) + 464
18 swift-frontend           0x00000001033b646c (anonymous namespace)::SILGenType::emitType() + 456
19 swift-frontend           0x0000000103250600 swift::ASTVisitor<swift::Lowering::SILGenModule, void, void, void, void, void, void>::visit(swift::Decl*) + 100
20 swift-frontend           0x0000000103257538 swift::ASTLoweringRequest::evaluate(swift::Evaluator&, swift::ASTLoweringDescriptor) const + 2364
21 swift-frontend           0x000000010339d61c swift::SimpleRequest<swift::ASTLoweringRequest, std::__1::unique_ptr<swift::SILModule, std::__1::default_delete<swift::SILModule>> (swift::ASTLoweringDescriptor), (swift::RequestFlags)17>::evaluateRequest(swift::ASTLoweringRequest const&, swift::Evaluator&) + 208
22 swift-frontend           0x000000010325bfc0 swift::ASTLoweringRequest::OutputType swift::Evaluator::getResultUncached<swift::ASTLoweringRequest, swift::ASTLoweringRequest::OutputType swift::evaluateOrFatal<swift::ASTLoweringRequest>(swift::Evaluator&, swift::ASTLoweringRequest)::'lambda'()>(swift::ASTLoweringRequest const&, swift::ASTLoweringRequest::OutputType swift::evaluateOrFatal<swift::ASTLoweringRequest>(swift::Evaluator&, swift::ASTLoweringRequest)::'lambda'()) + 572
23 swift-frontend           0x000000010275f708 swift::performCompileStepsPostSema(swift::CompilerInstance&, int&, swift::FrontendObserver*) + 964
24 swift-frontend           0x0000000102762a7c performCompile(swift::CompilerInstance&, int&, swift::FrontendObserver*) + 1764
25 swift-frontend           0x000000010276168c swift::performFrontend(llvm::ArrayRef<char const*>, char const*, void*, swift::FrontendObserver*) + 3580
26 swift-frontend           0x00000001026e2c6c swift::mainEntry(int, char const**) + 5412
27 dyld                     0x000000018ddd1d54 start + 7184

The crash is already resolved on the main branch, where the compiler reports a proper error instead:

<source>:9:15: error: cannot convert value of type 'InlineArray<4, UInt8>' to expected argument type 'UnsafePointer<UInt8>'
 7 | 		var data = Data()
 8 | 		
 9 | 		data.append(row, count: row.count)
   |               `- error: cannot convert value of type 'InlineArray<4, UInt8>' to expected argument type 'UnsafePointer<UInt8>'
10 | 		
11 | 		return data

We don’t have the implicit pointer conversion for InlineArray as we do for Array.

Great to hear, thank you!

For any particular release of Xcode, is there a way to know whether it contains fixes like this? Xcode's release notes typically list only a few Feedback/Radar items, so one has to do some trial and error to see what is actually fixed.

As for this item, I'm hoping it makes it into Xcode 26.1.

The only way to really know is to compare the tags or branches on GitHub, and that only really works for patch versions. For minor releases, the changes are far too extensive to review that way.

Fixes on main right now should only be expected in Swift 6.3 in the spring, with whatever Xcode version (26.3, 26.4, depends on how many releases they do). Unless they're cherry picked back to a 6.2 release branch, we won't see them any earlier.

In general, don't expect any but the most severe bug fixes in current versions of Swift, as the bar to cherry pick fixes from main to a release branch is usually very high, and requires someone to drive the inclusion of each and every potential pick.

For instance, there is this comparison of the original 6.2 tag with the latest development snapshot tag, though it doesn't seem entirely accurate either. Another complication is that the Swift that ships with Xcode doesn't necessarily match any particular open source release, so Apple's Swift 6.2.0 isn't necessarily the 6.2.0 tagged on GitHub.

Thank you for these insights.

I was hoping to compare against some baseline measurements to see what the performance gain would be, if any. But I will pause moving to InlineArray for now.

You should be able to easily do the following though:

row.span.withUnsafeBufferPointer {
  data.append($0.baseAddress!, count: $0.count)
}
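
For reference, here is how that workaround might look when applied to the simplified Column example from the first post. This is only a sketch, assuming a Swift 6.2 toolchain and a deployment target where InlineArray is available (macOS 26 or later). The row values are changed to 1 through 4 purely so the result is easy to check.

```swift
import Foundation

struct Column {
	// Values 1...4 are an illustrative assumption; the original used zeros.
	var row: InlineArray<4, UInt8> = [1, 2, 3, 4]

	var asData: Data {
		var data = Data()

		// InlineArray has no implicit conversion to UnsafePointer<UInt8>,
		// so borrow its storage through a Span and pass the pointer explicitly.
		row.span.withUnsafeBufferPointer { buffer in
			data.append(buffer.baseAddress!, count: buffer.count)
		}

		return data
	}
}
```

With this in place, Column().asData should hold the four bytes 1, 2, 3, 4.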

Thanks; that allowed me to take the new measurements. But alas, using InlineArray vs [UInt8] leads to a 50% slowdown. I suspect it’s due to how values are getting appended to the Data instance. I ran seven trials with each algorithm to rule out any flukes; timings were consistent.

There may be cases where InlineArray improves performance for folks, but the gain should be measured rather than assumed.

Can you post the full source of both examples that you benchmarked? Those results are a bit surprising.

This is a stripped down version, but timings show the same slowdown:

struct IITimeMetrics {
	var start: TimeInterval
	var end: TimeInterval
	
	var localizedDuration: String {
		String(format: "%.3f", end - start)
	}
	
	static let zero = IITimeMetrics(start: 0, end: 0)
}

final class IIOriginalArchiver {
	var workBuffer = [UInt8](repeating: 0, count: 8)
	var data = Data()
	
	func writeUInt16(_ aValue: UInt16) {
		workBuffer[0] = UInt8(truncatingIfNeeded: (aValue & 0xFF00) >> 8)
		workBuffer[1] = UInt8(truncatingIfNeeded: aValue & 0x00FF)
		
		data.append(workBuffer, count: 2)
	}
}

final class IIUpdatedArchiver {
	var workBuffer: InlineArray<8, UInt8> = [0, 0, 0, 0, 0, 0, 0, 0]
	var data = Data()
	
	func writeUInt16(_ aValue: UInt16) {
		workBuffer[0] = UInt8(truncatingIfNeeded: (aValue & 0xFF00) >> 8)
		workBuffer[1] = UInt8(truncatingIfNeeded: aValue & 0x00FF)
		
		workBuffer.span.withUnsafeBufferPointer {
			data.append($0.baseAddress!, count: 2)
		}
	}
}

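func measure() {
	// Declared locally so the stripped-down example compiles on its own.
	var timeMetrics = IITimeMetrics.zero
	let iterations = 10_000_000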

	timeMetrics.start = Date.now.timeIntervalSinceReferenceDate
	
	let archiver = IIOriginalArchiver()
//		let archiver = IIUpdatedArchiver()
	
	for _ in 1...iterations {
		archiver.writeUInt16(0xFE01)
	}
	
	timeMetrics.end = Date.now.timeIntervalSinceReferenceDate
			
	print("Duration of \(iterations) iterations: \(timeMetrics.localizedDuration) seconds")
}

Timings (top set is using original [UInt8] implementation; bottom set using InlineArray):

Duration of 10000000 iterations: 0.378 seconds
Duration of 10000000 iterations: 0.366 seconds
Duration of 10000000 iterations: 0.388 seconds

Duration of 10000000 iterations: 0.763 seconds
Duration of 10000000 iterations: 0.803 seconds
Duration of 10000000 iterations: 0.797 seconds

Note: I measured the empty loop overhead and it was negligible (0.000 seconds).

Just to be sure, these timings are from a release build, not a debug build, correct? Debug timings are usually a red herring.

If you were using a release build, an empty for loop would probably be optimized out.

Always a release build, yes. For the original code I measured, the loop had some additional overhead (~0.4 secs), so I subtracted that from the final times. The averages were 10 seconds for [UInt8] and 15 seconds for InlineArray.

I won't have access to a macOS 26 machine until later today so I can't test it as thoroughly as I'd like, so I'm using Godbolt instead. Granted, that's not the best for benchmarking because it's a remote server somewhere that I don't know the specs of or its workload at any given time.

Looking at the disassembly itself, the results don't make sense. The generated assembly code for the InlineArray version is significantly shorter and less complex than the one that uses Array. The generated code looks like what I expect it would.

One thing I noticed was that if I just pulled the archiver initialization up to before the start of the benchmark timing, the numbers come a lot closer together (and sometimes the inline array version finishes faster than the array version in that case).
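
A minimal sketch of that change, assuming the Array-based archiver from the post above: the archiver is created before the clock starts, so only the write loop is timed. The helper name timeWrites and the use of ContinuousClock are my own choices for illustration, not what the original benchmark used.

```swift
import Foundation

final class IIOriginalArchiver {
	var workBuffer = [UInt8](repeating: 0, count: 8)
	var data = Data()

	func writeUInt16(_ aValue: UInt16) {
		workBuffer[0] = UInt8(truncatingIfNeeded: aValue >> 8)
		workBuffer[1] = UInt8(truncatingIfNeeded: aValue)

		data.append(workBuffer, count: 2)
	}
}

// Hypothetical helper, not from the thread.
func timeWrites(iterations: Int) -> (elapsed: Duration, bytesWritten: Int) {
	// Setup happens before timing starts, so one-time initialization
	// costs are excluded from the measurement.
	let archiver = IIOriginalArchiver()

	let elapsed = ContinuousClock().measure {
		for _ in 0..<iterations {
			archiver.writeUInt16(0xFE01)
		}
	}

	// Returning the byte count keeps the written data observable.
	return (elapsed, archiver.data.count)
}
```

The same helper can be pointed at the InlineArray archiver to compare the two under identical conditions.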

Could the actual culprit here be something like one-time type metadata loading for InlineArray vs Array? It's still hard to imagine that something like that would dominate the timings by that much, though.

The playground I've been using: Compiler Explorer

Is it possible it's the null check in $0.baseAddress!? Changing it to $0.baseAddress.unsafelyUnwrapped makes it significantly faster for me (also testing on Godbolt).
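
To make that concrete, a variant of IIUpdatedArchiver with the force unwrap replaced might look like the following. It is only a sketch: the class name IIUncheckedArchiver is made up for illustration, and unsafelyUnwrapped is only sound here because the buffer has eight elements, so its span always has a non-nil base address.

```swift
import Foundation

// Hypothetical name; mirrors IIUpdatedArchiver from earlier in the thread.
final class IIUncheckedArchiver {
	var workBuffer: InlineArray<8, UInt8> = [0, 0, 0, 0, 0, 0, 0, 0]
	var data = Data()

	func writeUInt16(_ aValue: UInt16) {
		workBuffer[0] = UInt8(truncatingIfNeeded: aValue >> 8)
		workBuffer[1] = UInt8(truncatingIfNeeded: aValue)

		workBuffer.span.withUnsafeBufferPointer {
			// In optimized builds unsafelyUnwrapped skips the nil check
			// that the postfix ! operator performs.
			data.append($0.baseAddress.unsafelyUnwrapped, count: 2)
		}
	}
}
```

In debug builds unsafelyUnwrapped still traps on nil just like !, so any difference would only show up in release timings.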

There’s an unnecessary copy occurring when accessing the span of the inline array, but even with that it still runs faster than the array variant for me locally.

A couple of considerations:

  • As written, this is a memory-bound task (it writes 20 MB of data).
  • The true potential of InlineArray vs a normal Array will be realised when you are recreating the array: for Array that means a memory allocation/deallocation each time, which won't happen in the InlineArray case.
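
The second point can be sketched as follows. Both function names are hypothetical and the workload is deliberately trivial. Note that the optimizer can sometimes remove the Array allocation for small local arrays, so the difference should be measured rather than assumed.

```swift
// Hypothetical helpers for illustration; neither appears in the thread.

// Creates a fresh [UInt8] on every call. Array storage lives on the heap,
// so each call may allocate and free memory unless the optimizer removes it.
func sumWithArrayScratch(_ seed: UInt8) -> Int {
	var scratch = [UInt8](repeating: seed, count: 8)
	scratch[0] &+= 1

	var total = 0
	for byte in scratch {
		total += Int(byte)
	}
	return total
}

// Creates a fresh InlineArray on every call. Its eight bytes are stored
// inline in the local variable, so no heap allocation is needed.
func sumWithInlineScratch(_ seed: UInt8) -> Int {
	var scratch = InlineArray<8, UInt8>(repeating: seed)
	scratch[0] &+= 1

	var total = 0
	for i in scratch.indices {
		total += Int(scratch[i])
	}
	return total
}
```

Calling each function in a tight loop from a release build would be one way to compare the cost of recreating the scratch buffer.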