Why is this apparently leaking memory?

JetForMe · March 18, 2020, 10:19am

I've written code to incrementally decrypt a file using CommonCrypto. It allocates a 64-byte aligned buffer to put the decrypted data into, and then loads it 16 MiB at a time and decrypts it into that buffer. The resulting data is correct. What I can't figure out is why, in Xcode, it appears to leak. The file I’m loading is about 380 MiB. When I first allocate the destination buffer, I can see RAM use in Xcode jump by that much. Then in each pass through the loop, it jumps by another 16 MiB (give or take). I can't tell what in this code is holding onto memory.

This is the main loop. When it reads the next chunk, memory use goes up by 16 MiB (as expected). When it calls cryptorUpdate(), memory goes up by another 16 MiB (as expected). When it loops back around to the top, memory goes down by 16 MiB. I would have expected it to go down by 32 MiB.

	public
	func
	decrypt(input inInputURL: URL)
		throws
		-> Data
	{
		if !inInputURL.isFileURL
		{
			throw Error.inputNotFile
		}
		
		//	Get the input file length. We have to seek
		//	to the end of the file then get the offset,
		//	and things changed in iOS 13.0…
		
		let fh = try FileHandle(forReadingFrom: inInputURL)
		
		var inputDataSize: UInt64 = 0
		if #available(iOS 13.0, *)
		{
			try fh.__seek(toEndReturningOffset: &inputDataSize)
			try fh.seek(toOffset: 0)
		}
		else
		{
			fh.seekToEndOfFile()
			inputDataSize = fh.offsetInFile
			fh.seek(toFileOffset: 0)
		}
		
		if inputDataSize > Int.max
		{
			throw Error.inputTooLarge
		}
		
		let cryptor = try cryptorCreate(op: CCOperation(kCCDecrypt), key: self.key, iv: self.iv)
		
		//	Allocate a 64 byte-aligned buffer for the output…
		
		let startPtr = UnsafeMutableRawPointer.allocate(byteCount: Int(inputDataSize) + kCCBlockSizeAES128, alignment: 64)
		startPtr.initializeMemory(as: UInt8.self, repeating: UInt8(0), count: Int(inputDataSize) + kCCBlockSizeAES128)
		
		//	Loop over the input file, reading data in small chunks…
		
		var destPtr = startPtr
		var outputCount = 0
		var crypting = true
		while crypting
		{
			//	Read the next chunk of data…
			
			let data = fh.readData(ofLength: kChunkSize)
			
			//	If the data read is zero bytes long, we’ve hit
			//	the end of file (having previously crypted the
			//	last of the data), so pad the resulint crypt…
			
			let clearData: Data
			if data.count == 0
			{
				clearData = try cryptorFinal(cryptor)
				crypting = false
			}
			else
			{
				clearData = try cryptorUpdate(cryptor, input: data)
			}
			
			//	Append it to the UnsafeMutableRawPointer…
			
			clearData.withUnsafeBytes
			{ clearBytes in
				destPtr.copyMemory(from: clearBytes, byteCount: clearData.count)
			}
			destPtr = destPtr.advanced(by: clearData.count)
			
			outputCount += clearData.count
		}
		
		CCCryptorRelease(cryptor)
		
		assert(outputCount <= Int(inputDataSize) + kCCBlockSizeAES128)
		let output = Data(bytesNoCopy: startPtr,
							count: outputCount,
							deallocator: .custom(
							{ inPtr, inSize in
								inPtr.deallocate()
							}))
		return output
	}

this is the incremental decryption method:

	private
	func
	cryptorUpdate(_ inCryptor: CCCryptorRef, input inData: Data)
		throws
		-> Data
	{
		let finalOutput = try inData.withUnsafeBytes
		{ (inputBytes: UnsafePointer<UInt8>) -> Data in
			var moved: Int = 0
			let outputLength = CCCryptorGetOutputLength(inCryptor, inData.count, false);
			var output = Data(count: outputLength)
			try output.withUnsafeMutableBytes
			{ [ count = output.count ] (outputBytes: UnsafeMutablePointer<UInt8>) -> () in
				let status = CCCryptorUpdate(inCryptor, inputBytes, inData.count, outputBytes, count, &moved)
				if status != kCCSuccess
				{
					throw Error.cryptoFailed(status: status)
				}
				assert(moved == outputLength)
			}
			return output
		}
		
		return finalOutput
	}

Am I doing something wrong in here that's somehow hanging onto that memory? It's acting like I have Zombie Objects enabled, but I don't (and I don't even know if that applies to Swift).

harlanhaskins · March 18, 2020, 5:51pm

Something in readData(ofLength:) is putting memory into an autorelease pool, and since you're driving the whole thing via a while loop, there's never an opportunity to drain the pool.

Wrap the body of your while loop in an autoreleasepool { ... } block, and the memory should remain stable.

JetForMe · March 19, 2020, 7:09am

Thanks, that was it! Seems like a bug in readData(ofLength:) to me, or at least its docs.

I went looking for a better alternative, but it seems this has been a pain point for several years now, with no replacement on the horizon. There is a weird alternative in iOS 13, __readDataUp(toLength:) throws. I don’t understand why new APIs are being made with underscores in the name like that (AV camera capture has one too).

In any case, that one doesn't throw exceptions, so I’m using it when I can. It does, however, have the same autorelease pool behavior.

phoneyDev · March 19, 2020, 11:31pm

This is not a bug per se. FileHandle is a Foundation class and is undoubtedly written in Obj-C. Objects returned from method calls, like your data object that comes back from readData(ofLength:, will be autoreleased in most cases. Your code is the typical example of where an autorelease pool is needed: an inner loop that allocates an unlimited number of objects. If the code was pure swift I think the compiler will only use retain and release so you don't get a buildup of objects. But Obj-C works according to different rules.

Even when I was writing Obj-C I stayed away from FileHandle. I typically used the C file functions, sometimes wrapped in a class. Not sure if there's anything better than that now in Swift.

JetForMe · March 21, 2020, 2:53am

Well, would be nice for the docs to mention it. Anyway, it's all good. No, there's no better Swift replacement, although this kinda makes me want to make one.

CTMacUser · March 21, 2020, 6:19pm

Not commenting on your original question, but would this be an easier way to check for the file size?

// Check if we have a suitable file.
guard inInputURL.isFileURL else { throw Error.inputNotFile }
guard let fileSize = try inInputURL.resourceValues(forKeys: [.fileSizeKey]).fileSize else { throw Error.inputTooLarge }

(Warning: all used a Swift playground set to macOS on my Catalina system.)

The fileSize variable is locked as Int, implying that the file size already can't exceed Int.max.

I got warnings when pasting in the cryptorUpdate function. Maybe it can be replaced by:

private func update<T: ContiguousBytes>(cryptor: CCCryptorRef, with data: T) throws -> Data {
    return try data.withUnsafeBytes { inputBytes in
        let outputLength = CCCryptorGetOutputLength(cryptor, inputBytes.count, false)
        var output = Data(count: outputLength)
        try output.withUnsafeMutableBytes { outputBytes in
            var bytesMoved = 0
            let status = CCCryptorUpdate(cryptor, inputBytes.baseAddress, inputBytes.count, outputBytes.baseAddress, outputLength, &bytesMoved)
            if status != kCCSuccess {
                throw Error.cryptoFailed(status: status)
            }
            assert(bytesMoved == outputLength)
        }
        return output
    }
}

JetForMe · March 21, 2020, 11:56pm

CTMacUser:

Not commenting on your original question, but would this be an easier way to check for the file size?
// Check if we have a suitable file.
guard inInputURL.isFileURL else { throw Error.inputNotFile }
guard let fileSize = try inInputURL.resourceValues(forKeys: [.fileSizeKey]).fileSize else { throw Error.inputTooLarge }

Probably; I considered that briefly when I was looking for a size property on FileHandle, but both seem excessively clunky.

CTMacUser:

I got warnings when pasting in the cryptorUpdate function. Maybe it can be replaced by:

private func update<T: ContiguousBytes>(cryptor: CCCryptorRef, with data: T) throws -> Data {
    return try data.withUnsafeBytes { inputBytes in
        let outputLength = CCCryptorGetOutputLength(cryptor, inputBytes.count, false)
        var output = Data(count: outputLength)
        try output.withUnsafeMutableBytes { outputBytes in
            var bytesMoved = 0
            let status = CCCryptorUpdate(cryptor, inputBytes.baseAddress, inputBytes.count, outputBytes.baseAddress, outputLength, &bytesMoved)
            if status != kCCSuccess {
                throw Error.cryptoFailed(status: status)
            }
            assert(bytesMoved == outputLength)
        }
        return output
    }
}

Do you mean 'withUnsafeBytes' is deprecated: use `withUnsafeBytes<R>(_: (UnsafeRawBufferPointer) throws -> R) rethrows -> R` instead?

My code is being compiled with Swift 4.2. It seems in Swift 5 they deprecated that form of withUnsafeBytes, but I didn't want to deal with updating the whole project, and I wasn’t sure I understand what's different well enough to ensure I'd be able to get rid of the warnings. I don't really understand what that warning is trying to tell me.

EDIT: When I initially didn't specify the type for inputBytes, I got this error in CCCryptorUpdate:

Cannot convert value of type 'UnsafeRawBufferPointer' to expected argument type 'UnsafeRawPointer?'

JetForMe · March 22, 2020, 12:12am

Hmm. I did just rewrite it as

	private
	func
	cryptorUpdate(_ inCryptor: CCCryptorRef, input inData: Data)
		throws
		-> Data
	{
		let finalOutput = try inData.withUnsafeBytes
		{ inputBytes -> Data in
			var moved: Int = 0
			let outputLength = CCCryptorGetOutputLength(inCryptor, inputBytes.count, false);
			var output = Data(count: outputLength)
			try output.withUnsafeMutableBytes
			{ outputBytes in
				let status = CCCryptorUpdate(inCryptor, inputBytes.baseAddress, inputBytes.count, outputBytes.baseAddress, outputBytes.count, &moved)
				if status != kCCSuccess
				{
					throw Error.cryptoFailed(status: status)
				}
				assert(moved == outputLength)	//	This seems to hold true, and requires adjusting the buffer if not
			}
			return output
		}
		
		return finalOutput
	}

and that seems to compile without warnings on both Swift 4.2 and 5, and it seems to run correctly. Not sure if it’s correct. I still had to specify the return type in the first closure to get it to compile. One benefit is outputBytes now has a count property, so I could get rid of the local capture.

Lantua · March 22, 2020, 12:25am

Compiler doesn't infer return type from inside multi-statement closure (yet). Likely you can return the result, without intermediate value, to get the return type from return:

return try inData. ... // no need for `finalOutput`

JetForMe · March 22, 2020, 1:14am

Lantua:

Compiler doesn't infer return type from inside multi-statement closure (yet). Likely you can return the result, without intermediate value, to get the return type from return :
return try inData. ... // no need for `finalOutput`

Oh I'm aware of the compiler’s limitation, just pointing out that the proffered solution was incomplete. And yeah, I could skip the intermediate, but it makes it easier to examine when stepping through in the debugger.