Memory leaking in Vapor app

Tof · March 29, 2019, 5:53am

I tried but I can not

Yes I can do that

Joannis_Orlandos · March 29, 2019, 7:40am

@iljaiwas how did you compile your application: using swift build or swift build -c release?

If you're running a debug build, note that this line doesn't remove the future from its memory, probably causing a leak.

iljaiwas · March 29, 2019, 8:30am

I just checked with our Dockerfile and we are indeed using swift build -c release to compile the production version of our app.

Tof · March 29, 2019, 10:54am

@johannesweiss I can not build the application

root@09e8e75de965:/app# swift build --sanitize=address
Compile Swift Module 'Generator' (9 sources)
Compile Swift Module 'Debugging' (3 sources)
Compile Swift Module 'COperatingSystem' (1 sources)
Compile Swift Module 'Swiftgger' (53 sources)
Compile Swift Module 'NIOPriorityQueue' (2 sources)
Compile CNIOZlib empty.c
Compile CNIOSHA1 c_nio_sha1.c
Compile CNIOOpenSSL shims.c
Compile CNIOOpenSSL helpers.c
Compile CNIOLinux shim.c
Compile CNIOLinux ifaddrs-android.c
Compile Swift Module 'Regex' (33 sources)
Compile CNIOHTTPParser c_nio_http_parser.c
Compile CNIODarwin shim.c
Compile CNIOAtomics src/c-atomics.c
Compile CCryptoOpenSSL shim.c
Compile CBcrypt blf.c
Compile CBcrypt bcrypt.c
Compile CBase32 base32.c
Compile Swift Module 'NIOConcurrencyHelpers' (2 sources)
Compile Swift Module 'NIO' (55 sources)
Compile Swift Module 'NIOTLS' (3 sources)
Compile Swift Module 'NIOFoundationCompat' (1 sources)
Compile Swift Module 'Bits' (12 sources)
Compile Swift Module 'NIOHTTP1' (9 sources)
Compile Swift Module 'Async' (15 sources)
Compile Swift Module 'NIOOpenSSL' (17 sources)
Compile Swift Module 'Random' (4 sources)
Compile Swift Module 'Core' (25 sources)
Compile Swift Module 'NIOWebSocket' (9 sources)
Compile Swift Module 'Validation' (18 sources)
Compile Swift Module 'URLEncodedForm' (8 sources)
Compile Swift Module 'Service' (20 sources)
Compile Swift Module 'Multipart' (8 sources)
Compile Swift Module 'Logging' (4 sources)
Compile Swift Module 'Crypto' (19 sources)
Compile Swift Module 'HTTP' (26 sources)
Compile Swift Module 'TemplateKit' (41 sources)
Compile Swift Module 'Routing' (12 sources)
Compile Swift Module 'DatabaseKit' (30 sources)
Compile Swift Module 'Console' (28 sources)
Compile Swift Module 'SQL' (59 sources)
Compile Swift Module 'Redis' (20 sources)
Compile Swift Module 'Command' (16 sources)
Compile Swift Module 'WebSocket' (6 sources)
Compile Swift Module 'Fluent' (49 sources)
Compile Swift Module 'Vapor' (75 sources)
Compile Swift Module 'FluentSQL' (9 sources)
Compile Swift Module 'PostgreSQL' (73 sources)
Compile Swift Module 'FluentPostgreSQL' (17 sources)
Compile Swift Module 'App' (42 sources)
Compile Swift Module 'Run' (1 sources)
Linking ./.build/x86_64-unknown-linux/debug/Run
/app/.build/checkouts/crypto.git-7538185120456515950/Sources/CBase32/base32.c:95: error: undefined reference to '__asan_version_mismatch_check_v6'
/app/.build/checkouts/crypto.git-7538185120456515950/Sources/CBcrypt/bcrypt.c:260: error: undefined reference to '__asan_version_mismatch_check_v6'
/app/.build/checkouts/crypto.git-7538185120456515950/Sources/CBcrypt/blf.c:657: error: undefined reference to '__asan_version_mismatch_check_v6'
/app/.build/checkouts/crypto.git-7538185120456515950/Sources/CCryptoOpenSSL/shim.c:29: error: undefined reference to '__asan_version_mismatch_check_v6'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
<unknown>:0: error: link command failed with exit code 1 (use -v to see invocation)
error: terminated(1): /usr/bin/swift-build-tool -f /app/.build/debug.yaml main output:

johannesweiss · March 29, 2019, 12:37pm

Sorry, could you file a bug about this at bugs.swift.org? To work around this, maybe you could try a rm -rf .build just before the swift build --sanitize=address

johannesweiss · March 29, 2019, 6:24pm

So first of all, it's very important that we're not blaming anybody. Every participant here was very reasonable and also we don't know the final outcome yet. And thanks very much to the OP for posting such a succinct reproduction, that was/is invaluable.

tl;dr

It's (probably) not a leak but there's too much memory allocated. I say probably because I could not see a leak with the demo program from the post but I can't say for sure that the OP's real production application doesn't have a leak.

long story

Okay, I now had a chance to look into this and at least the reproduction that the OP posted does not leak for me. BUT it's very important to say that for the longest time I did think that it leaked. Let me go through why.

What I did first is to use valgrind on this app, then run the Scripts/run_test.rb and after the script ran through, I pressed Ctrl+C and valgrind presented me with:

==19319== 2,478,760 bytes in 9,995 blocks are definitely lost in loss record 1,174 of 1,177
==19319==    at 0x4C2DE56: malloc (vg_replace_malloc.c:299)
==19319==    by 0x5202D51: swift_slowAlloc (in /usr/lib/swift/linux/libswiftCore.so)
==19319==    by 0x5202DBE: _swift_allocObject_(swift::TargetHeapMetadata<swift::InProcess> const*, unsigned long, unsigned long) (in /usr/lib/swift/linux/libswiftCore.so)
==19319==    by 0x50ED6FC: Swift._StringGuts.reserveUnusedCapacity(_: Swift.Int, ascii: Swift.Bool) -> () (in /usr/lib/swift/linux/libswiftCore.so)
==19319==    by 0x5D57092: generic specialization <[Swift.Character]> of Swift.String.append<A where A: Swift.Sequence, A.Element == Swift.Character>(contentsOf: A) -> () (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5D54ECE: Foundation.NSData.base64EncodedString(options: Foundation.NSData.Base64EncodingOptions) -> Swift.String (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5FF0FC5: partial apply forwarder for closure #1 (Foundation.NSData) -> Swift.String in Foundation.Data.base64EncodedString(options: Foundation.NSData.Base64EncodingOptions) -> Swift.String (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5FEDB04: function signature specialization <Arg[0] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : reabstraction thunk helper from @callee_guaranteed (@guaranteed Foundation.NSData) -> (@unowned Foundation._NSRange, @error @owned Swift.Error) to @escaping @callee_guaranteed (@guaranteed Foundation.NSData) -> (@out Foundation._NSRange, @error @owned Swift.Error), Argument Types : [@callee_guaranteed (@guaranteed Foundation.NSData) -> (@unowned Foundation._NSRange, @error @owned Swift.Error)]> of generic specialization <Foundation._NSRange> of Foundation._DataStorage.withInteriorPointerReference<A>(Swift.Range<Swift.Int>, (Foundation.NSData) throws -> A) throws -> A (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5FDF1A9: Foundation.Data.base64EncodedString(options: Foundation.NSData.Base64EncodingOptions) -> Swift.String (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x606CE37: Foundation.(_JSONEncoder in _12768CA107A31EF2DCE034FD75B541C9).box_<A where A: Swift.Encodable>(A) throws -> Foundation.NSObject? (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x6072B17: Foundation.(_JSONEncoder in _12768CA107A31EF2DCE034FD75B541C9).encode<A where A: Swift.Encodable>(A) throws -> () (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x607367F: protocol witness for Swift.SingleValueEncodingContainer.encode<A where A1: Swift.Encodable>(A1) throws -> () in conformance Foundation.(_JSONEncoder in _12768CA107A31EF2DCE034FD75B541C9) : Swift.SingleValueEncodingContainer in Foundation (in /usr/lib/swift/linux/libFoundation.so)

so it says a couple of Strings have been definitely lost. So I believed that and dug a bit further. Also almost all BinaryData objects appeared to have leaked.

A little later, I noticed that valgrind was still running at 100% CPU for almost a minute after Scripts/run_test.rb stopped. When I waited that minute until valgrind was back to 0% CPU and pressed Ctrl+C it actually reported no leaks anymore!

So why is that? In the original post, you can see an incredible sawtooth pattern: the app consumes a massive amount of memory and then goes back to idle. valgrind makes the execution so slow that when I pressed Ctrl+C I was still near the top of the sawtooth. In a C program, valgrind would now have reported 'still reachable', but in Swift it's possible that valgrind reports 'definitely lost' when in reality that's not true. Why? Because the Swift compiler is very clever and uses the spare bits of pointers to store things like enum tags. So when valgrind tries to find a reference to, say, pointer value 0xdeadbe00, it won't find one because, through the clever packing of enum tags in the spare bits of the pointers, the pointer might be saved in memory as 0xdeadbeef (here the extra 0xef could be an enum tag). It's also interesting that it shows a String as leaked, because String makes a lot of use of clever bit-packing, which is pretty awesome as it gives us great performance. Unfortunately it also confuses leak checkers that just check whether a certain pointer value is present anywhere else in memory (which would form a reference to it).
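The spare-bits effect described above can be imitated with plain integers. This is only a toy sketch: the address and the 8-bit tag are the illustrative values from the post, the "scanner" is a hypothetical stand-in for a leak checker, and the real Swift runtime chooses which bits to use per platform.

```swift
// Toy model of a leak checker that scans memory for exact pointer values.
// Address and tag are the illustrative values from the post, not what the
// Swift runtime actually uses.
let objectAddress: UInt = 0xdeadbe00   // where the heap object really lives
let enumTag: UInt = 0xef               // hypothetical tag packed into spare low bits
let storedWord = objectAddress | enumTag   // what actually sits in memory

// A scanner that looks for the exact address finds no reference to the object.
func scanFinds(_ address: UInt, in words: [UInt]) -> Bool {
    return words.contains(address)
}
let memoryWords: [UInt] = [0x1000, storedWord, 0x2000]
let foundByScanner = scanFinds(objectAddress, in: memoryWords)

// The program masks the tag off before use, so the object is still reachable.
let recoveredAddress = storedWord & ~UInt(0xff)

print(foundByScanner, recoveredAddress == objectAddress)  // prints "false true"
```

The object is reachable from the program's point of view, yet a value-matching scan reports no pointer to it, which is what a 'definitely lost' report looks like.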

Right, I confirmed a couple of times that when I waited enough time, valgrind was always confirming 'no leaks'.

Okay, so I ditched valgrind and looked at the raw memory stats. This is from the very beginning of the program:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      3412  0.1  0.6 472092 26356 pts/0    Sl+  17:56   0:00 .build/x86_64-unknown-linux/release/Run

So we see about 26MB resident size. Let's run the Scripts/run_test.rb and let's check ps again:

root      3412  2.4 10.1 865308 408780 pts/0   Sl   17:56   0:02 .build/x86_64-unknown-linux/release/Run

BAD! Now we're having 408MB resident. Again, this looks like a massive leak! Let's run the Scripts/run_test.rb again:

and let's check ps:

root      3412  3.2 11.7 865308 473468 pts/0   Sl   17:56   0:04 .build/x86_64-unknown-linux/release/Run

now we're seeing 473MB, wow even more leaks, but oddly, it's less pronounced now... One more time:

root      3412  3.8 11.9 865308 481064 pts/0   Sl   17:56   0:06 .build/x86_64-unknown-linux/release/Run

481MB, ha, the leaks are getting smaller. Because memory fragmentation is a thing, I decided to use mallinfo to let the Linux allocator tell me how many bytes are actually allocated, rather than how much memory the application requested from the kernel.

So back to square one, let's print mallinfo's numbers every 10 s. At the start of the program it prints

mcheck_malloc_stats(arena: 1482752, ordblks: 524, smblks: 85, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 4928, uordblks: 914544, fordblks: 568208, keepcost: 263888)

we can ignore most of these cryptic numbers but uordblks: 914544 tells us the number of bytes that are actually in use from the allocator. This is just under 1MB. Let's run Scripts/run_test.rb again:

After one iteration it prints

mcheck_malloc_stats(arena: 470839296, ordblks: 627, smblks: 119, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 7264, uordblks: 1027664, fordblks: 469811632, keepcost: 263888)

so uordblks: 1027664, that's just over 1MB. So we did grow in memory, but only veeeery slightly. Let's run the script again:

mcheck_malloc_stats(arena: 471547904, ordblks: 629, smblks: 119, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 7184, uordblks: 1027728, fordblks: 470520176, keepcost: 263888)

ha, we didn't grow a single byte anymore: uordblks: 1027728. Let's do it another 10 times:

mcheck_malloc_stats(arena: 471572480, ordblks: 629, smblks: 119, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 7168, uordblks: 1027728, fordblks: 470544752, keepcost: 263888)

and again: not a single byte leaked.

So what's going on: why does this process use 25MB of kernel memory at the beginning, then grow to almost 400MB and never shrink again, despite the fact that only 1MB of memory is actually allocated? It's memory fragmentation. The program the OP posted loads 10,000 objects into memory at the same time. Due to (I think) an inefficiency in Vapor's mysql driver, these 10,000 objects temporarily hold on to pretty big Data objects and ByteBuffers. But they all need to fit in memory at the same time, so the program has to request almost 500MB from the kernel to fit them in. Only a very few seconds later, however, it'll release all the memory, but due to fragmentation pretty much all the regions we have allocated from the kernel are still in use, at least partially.

That's why we see the kernel memory growing slower and slower: We start with 25MB, then go to 400MB, then 450MB, then 470MB, then 480MB. At some point we just have allocated enough kernel memory to fit the next 10,000 objects into memory again.
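A toy model makes the fragmentation argument concrete. The sketch below is not how glibc's allocator works internally; it only assumes, as described above, that memory comes from the kernel in regions and that a region can be handed back only when nothing in it is live. `ToyHeap` and the 100-slots-per-region figure are made up for illustration.

```swift
/// Toy allocator: memory is obtained from the "kernel" in fixed-size regions,
/// and a region can only be returned once every object in it has been freed.
struct ToyHeap {
    let slotsPerRegion: Int
    private(set) var liveCount: [Int] = []   // live objects per region
    private var usedInLastRegion: Int

    init(slotsPerRegion: Int) {
        self.slotsPerRegion = slotsPerRegion
        self.usedInLastRegion = slotsPerRegion   // forces a new region on first use
    }

    /// Allocates one object and returns the index of the region it landed in.
    mutating func allocate() -> Int {
        if usedInLastRegion == slotsPerRegion {
            liveCount.append(0)
            usedInLastRegion = 0
        }
        usedInLastRegion += 1
        liveCount[liveCount.count - 1] += 1
        return liveCount.count - 1
    }

    mutating func free(inRegion region: Int) {
        liveCount[region] -= 1
    }

    var liveObjects: Int { return liveCount.reduce(0, +) }
    var regionsStillHeld: Int { return liveCount.filter { $0 > 0 }.count }
}

var heap = ToyHeap(slotsPerRegion: 100)
var regionOfObject: [Int] = []
for _ in 0..<10_000 { regionOfObject.append(heap.allocate()) }
let regionsAtPeak = heap.regionsStillHeld

// Free 99% of the objects, but leave one survivor in every region.
for (i, region) in regionOfObject.enumerated() where i % 100 != 0 {
    heap.free(inRegion: region)
}
print(regionsAtPeak, heap.liveObjects, heap.regionsStillHeld)  // prints "100 100 100"
```

Only 1% of the objects survive, yet none of the 100 regions can be returned, which mirrors a small uordblks next to a large arena.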

How can we fix it?

So the problem is that too much memory is needed at the same time. It's not leaked, but that's still bad because our heap grows pretty big. How can this be resolved? There are two options:

The OP's program loads 10,000 objects into memory, which can be a lot depending on the size of the objects. So maybe that can be reduced?
I believe (I don't have proof for this and didn't check properly) that the mysql driver allocates pretty big ByteBuffer objects, and I'm not sure they all need to be that big.

So why do I believe (without proof) that the mysql driver might allocate objects that are too big? Our maximum memory with loading 10,000 objects into memory is about 500MB. That makes (roughly) 50k per object. And that seems way too large. I have no idea what's stored in the mysql database or what exactly these 10,000 objects represent, but it seems unlikely that they should be 50k large. I'm very happy to be wrong here: if each of them is some kind of image, or large body of text, or something, that all makes sense, but it feels like a bit much.

How could this happen that one would by accident hold onto ByteBuffer/Data that holds lots of data in memory? It's actually quite simple. Imagine you'd write a database driver. So you communicate with the database and the database sends you back lots of data, but maybe you're only interested in a very small portion. For the sake of the argument, let's say the database responds with 1k of data but you're only interested in say 4 bytes of that. Now you might do

let data: Data = byteBuffer.readData(length: 4)

But the byteBuffer might contain the whole 1k response. And because we try not to copy the bytes over, the returned Data object will reference the whole 1k ByteBuffer, despite the fact it's only '4 bytes long'.
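The same retention effect can be shown without NIO. In this sketch `BackingStorage` and `ByteView` are hypothetical stand-ins for the ByteBuffer's storage and for a Data that references it; they are not NIO or Foundation types, and a weak reference is used only to observe when the storage is freed.

```swift
// Hypothetical stand-in for a ByteBuffer's backing storage (not a NIO type).
final class BackingStorage {
    let bytes: [UInt8]
    init(count: Int) { self.bytes = [UInt8](repeating: 0xAB, count: count) }
}

// A view that references the storage instead of copying out of it.
struct ByteView {
    let storage: BackingStorage
    let range: Range<Int>
    var count: Int { return range.count }

    // Forces a copy of just the viewed bytes into an independent array.
    func copyOut() -> [UInt8] { return Array(storage.bytes[range]) }
}

func readFourBytes() -> ByteView {
    let response = BackingStorage(count: 1024)   // the whole 1k response
    return ByteView(storage: response, range: 0..<4)
}

var view: ByteView? = readFourBytes()
weak var storage = view?.storage      // observes the storage without owning it
let copied = view!.copyOut()

let keptAliveByView = (storage != nil)   // the 4-byte view pins all 1024 bytes
view = nil                               // keep only the copy
let freedAfterCopy = (storage == nil)    // now the 1k storage can go away

print(copied.count, keptAliveByView, freedAfterCopy)
```

Holding 10,000 such views at once keeps 10,000 full responses alive, whereas holding the copies keeps only the bytes that were asked for.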

Clearly something's wrong here. In NIO, if you want to force a copy, you can do

let data = Data(byteBuffer.viewBytes(at: byteBuffer.readerIndex, length: 4))

but I think this is way too hard. NIO's getData/readData should do a bit better. We can't be perfect here because it's impossible to decide for sure if it would be better to copy the bytes or not. But NIO should definitely have better heuristics. If for example it's less than 1024 bytes or so, we should just copy. Also if the ByteBuffer is massively larger than what we want out of it, we should also copy instead of referencing the whole ByteBuffer. I filed a bug for this.

To get to the best possible result, NIO and Vapor's database drivers need to work together. If a copy is definitely better, NIO should probably do that automatically but when Vapor's driver knows that a copy would be sensible, it should also force that copy.
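A copy-or-reference heuristic along these lines could look like the sketch below. It is not NIO's implementation: the 1024-byte threshold is the figure floated above, while the 16x "massively larger" ratio and the function name `shouldCopy` are assumptions that would need to be measured and bikeshedded.

```swift
// Hypothetical thresholds: 1024 is the figure mentioned in the post,
// the 16x ratio is a placeholder that would need measuring.
let smallReadThreshold = 1024
let oversizeRatio = 16

/// Returns true when copying `length` bytes out of a buffer holding
/// `bufferCapacity` bytes looks cheaper than keeping the whole buffer alive.
func shouldCopy(length: Int, bufferCapacity: Int) -> Bool {
    // Small reads: a copy is cheap, so never pin the buffer for them.
    if length < smallReadThreshold { return true }
    // Large reads: copy only if the buffer dwarfs the requested bytes.
    // Dividing (rather than multiplying `length`) avoids integer overflow.
    return bufferCapacity / oversizeRatio >= length
}

let tinyRead = shouldCopy(length: 4, bufferCapacity: 1024)
let sliceOfHugeBuffer = shouldCopy(length: 1_048_576, bufferCapacity: 104_857_600)
let halfOfBuffer = shouldCopy(length: 524_288, bufferCapacity: 1_048_576)
print(tinyRead, sliceOfHugeBuffer, halfOfBuffer)  // prints "true true false"
```

The first case is the 4-bytes-out-of-1k read, the second is 1MB out of a 100MB buffer, and the third is a read that already covers half the buffer, where referencing is the better deal.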

Sorry, I haven't proof-read this at all and I realise it's a massive wall of text. But I'm off for the weekend (without computer) so I thought it might be more useful to send my finding here than not.

conclusion

Please verify that you can confirm my findings. You can just do

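import Foundation // Dispatch and sleep; mallinfo must come from your own C module

// Print the allocator statistics every 10 seconds on a background queue.
DispatchQueue(label: "print").async {
    while true {
        let info = mallinfo() // Linux (glibc) only
        print(info)
        sleep(10)
    }
}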

to print the information every 10 seconds. But mallinfo only works on Linux and you need to create a C module to expose it to Swift.

edits/notes

when I write “kernel memory” I mean memory provided by the kernel. The user-space allocator can request a memory region from the kernel and it can also return that memory to the kernel. The problem is that it can only return the whole region. So if it requested say 1MB and after a while is only using 10 bytes from that 1MB region, then it cannot return it. And ps will show that it needs the whole 1MB.
the reason this only reproduced on Linux was twofold: 1) you’d only use valgrind for leak checking on Linux 2) the usual macOS tools all report (I think) the actually allocated size and not what has been handed out by the kernel
need to check if I can get LSan to work with Swift properly

iljaiwas · March 29, 2019, 8:29pm

Thanks a lot for the detailed analysis. I'm still trying to grasp what it means for us.

FYI: Our database stores serialised objects of different types from a rather complex Mac desktop application (at least that's what we believe). The plan is to use it as a backend to facilitate synchronisation between different Macs running that software.

On average, our users have well about 1 GB of data each (not counting image data), the average size of the data blob is around 600 Bytes, but we have some outliers with more than 1 MB (think large texts). Maybe we need to store larger objects outside the database, as we already do with images.

amosavian · March 29, 2019, 8:32pm

No offence, but I think it’s reasonable to copy when the size is equal to or less than the inlined data size (14 bytes on 64-bit architectures and 6 bytes on 32-bit)

Helge_Hess1 · March 29, 2019, 9:06pm

That doesn't sound convincing. I'd expect the database to return ByteBuffers with a compact binary or at worst string representation of just the data requested (maybe in big endian, which ideally could be transformed in the same buffer instance). It is not like MySQL is always going to send you the full row even if you just request the primary key ;-)
In short: all data returned by MySQL should essentially be raw payload and all necessary.

Taking slices of the receive buffer for segmentation and not copying the buffer sounds very right to me (i.e. no copying, but backing the data vend by the driver using the buffer).

From what you wrote taking a Data view backed by the BB (why would you even do that instead of sticking to a ByteBuffer slice?) would still result in an allocation. Maybe it is this one I filed: SR-7378. When using the 'deallocator' variant of Data (i.e. "backed by a different storage"), this always results in a closure capture (malloc) with the current API (SR-7378) (which is 100% counter the purpose of this API).
Well and if this is done for every or many columns by the driver, no surprise ...

TBH: I would recommend just using the MySQL C client library which presumably is optimised to death. OK, that's not quite right, I would recommend to use PG instead :-)

For the driver, maybe avoid Data and use BB slices instead?

Helge_Hess1 · March 29, 2019, 9:23pm

I don't know that much about MySQL (and I tried to get your sample to run to have a look, but didn't manage), but in PostgreSQL:

The storage requirement for a short string (up to 126 bytes) is 1 byte plus the actual string, ... Long strings are compressed .... Very long values are also stored in background tables

I expect MySQL to employ a similar mechanism, i.e. modern databases already do the "store larger objects outside the database" for you. It can make sense for payloads directly delivered via HTTP (e.g. images), because they can be sendfile'ed very efficiently.

The issue discussed by Johannes is due to fragmentation, i.e. unnecessarily splitting those 600 bytes into say 60+ own heap objects (by X rows). That probably warrants a fix in the driver, not in your application.

Generally fetching 10k 600 byte objects is a no brainer and should consume: 10k * 600 bytes = 6 MB of raw data memory. Even if you 10x that due to representation overhead, you end up w/ 60MB not 400MB ;-)

I suggest you wait a few more days to see whether the issue can be solved before switching server environments. Swift is actually pretty good for those kinds of things (i.e. has structs and doesn't require as many objects as other environments, leading to fragmentation).

johannesweiss · March 29, 2019, 10:12pm

Sorry, I wasn’t precise enough here: there are cases when we know a copy is better, but in some cases you can’t automatically find the best solution. We do give the user control over it but we should document it better.

johannesweiss · March 29, 2019, 10:25pm

If you can hold fewer objects in memory at the same time that will always help. If you can’t then we should make sure that the objects that need to be in memory at the same time are as small as feasibly possible. And I’m fairly sure that the mysql driver could be optimised here.

If I were you, I’d try to reduce the number of objects resident in memory at the same time and work with the Vapor folks to make sure we’re not holding onto more bytes than necessary per object.

I’d start with making a back of the envelope calculation: how many objects do we need to keep in memory * the size of those objects. That’ll be your best possible peak memory usage.

And Swift is a great language to get the peak actual memory usage very close to the minimum theoretical required memory usage. Currently I’m pretty sure the actual peak memory usage is like an order of magnitude above the theoretical best peak.
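As a rough sketch of that back-of-the-envelope calculation, using only numbers that appear in this thread: 10,000 objects, the 600-byte average blob size reported for the production data (an assumption when applied to the demo program), and the arena value printed by mallinfo after the first test run.

```swift
// Theoretical best case: every object resident at once, payload only.
let objectCount = 10_000
let averageObjectBytes = 600   // production average; assumed to hold for the demo
let theoreticalPeakBytes = objectCount * averageObjectBytes

// Observed: `arena` from mallinfo after one run of Scripts/run_test.rb.
let observedArenaBytes = 470_839_296
let observedBytesPerObject = observedArenaBytes / objectCount
let overheadFactor = Double(observedArenaBytes) / Double(theoreticalPeakBytes)

print(theoreticalPeakBytes, observedBytesPerObject, Int(overheadFactor))
// prints "6000000 47083 78"
```

Under these assumptions the heap peaks at well over an order of magnitude above the theoretical minimum, which matches the "roughly 50k per object" estimate from the analysis.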

And lastly: whilst your app at the moment might look very bloated in memory, it’s important to see that the bytes are not actually lost: they can and will be re-used for future allocations, they are just not returned to the kernel. For example, in my analysis, the program has 500MB worth of pages assigned to it but is only using 1MB of them. To start with, that just means the program can allocate almost 499 MB for “free”, i.e. without asking the kernel for more memory. I’m not saying a fragmented heap is great, but especially on a server it’s also not the end of the world.

johannesweiss · March 29, 2019, 10:29pm

Agreed, but do we know that through the ORM (if there’s one in use) we don’t request more? I’m not saying this can’t be fixed, just saying that it looks like a too big ByteBuffer happened to be kept alive for something rather small. But as I said, mostly guessing here.

Very much of this! Swift is really well equipped to be very tight in the memory needed. I was just trying to explain that accidentally keeping way too much in memory can happen rather easily. The issue we’re seeing here is absolutely not a Swift issue, these sorts of issues do happen in all languages.

After all, the program requested for all those bytes to be in memory, but by a few layers of abstraction.

2 allocations (1 for __DataStorage, 1 for the custom deallocator storage).

Therefore copying if the amount of bytes is below a to-be-measured threshold will help. The other benefit of copying is that Data can then use its preferred native storage.

But as you point out: the fastest would be to not translate at all and stay in ByteBuffer(View or slices). But for the problem at hand it wouldn’t change much because we’re interested in peak memory usage here.

Well, so firstly we (at least I) have no clue how many of these bytes are actually ‘needed’ for the object that we load. Secondly, is there maybe some ORM that loads in more? Thirdly, I saw some JSON encoding (which base64 encoded stuff). I don’t know if these things are necessary, or overhead... I suggest, let’s have the people who actually know what the application does and the people who know how the ORM/driver work have a look into this. And if there are further questions/bugs found/..., we can all work together to resolve them and make our ecosystem even better

Helge_Hess1 · March 29, 2019, 10:50pm

I don't think Vapor has an ORM (Object Relational Mapper ...), doesn't it just use Codable to map incoming data to arbitrary value types? So in the worst case the data should be gone after mapping? (it likely doesn't have an ORM like uniquing context, aka NSManagedObjectContext/EOEditingContext, which tends to gobble up memory :-) ).

But it sounds a little like the user may be keeping a handle to some Data column (which may then be referring to the full ByteBuffer). That might explain it. (though I guess not really, because all those Data columns in a single result set would share the same BB, which is IMO much better than copying the buffer into 10k standalone Data storage objects, maybe the root cause is really the use of Data here with its dual-alloc, instead of using a BB slice).

P.S.: "But I'm off for the weekend (without computer)" - I hope you actually leave it behind ;-)

tachyonics · March 30, 2019, 7:18pm

Incidentally, my week was spent looking at memory utilisation graphs which were gradually increasing in a number of our SmokeFramework + SmokeAWS + SwiftNIO services. My investigations tally closely with the analysis by @johannesweiss that it is related to peak memory usage. I gained more insight into what I had been seeing from Johannes' comments so that has been very useful as well.

I don't want to hijack a thread about Vapor but I did want to mention that we are seeing similar issues and there are likely optimisations the frameworks can do to at least minimise the impact of this.

amosavian · March 30, 2019, 10:15pm

I checked the code of reading ByteBuffer as Data here:

github.com/apple/swift-nio/blob/main/Sources/NIOFoundationCompat/ByteBuffer-foundation.swift#L69-L74

    /// Read `length` bytes off this `ByteBuffer`, move the reader index forward by `length` bytes and return the result
    /// as `Data`.
    ///
    /// - parameters:
    ///     - length: The number of bytes to be read from this `ByteBuffer`.

We are not going to copy here when data length is inlinable. I also checked Foundation source and it does not copy either (for obvious reasons).

That was only a suggestion to copy data when they are inlinable. You may oppose for other reasons as you know NIO much better than me. But I think it is worth considering, as inlinable data are held on the stack and there is no malloc overhead.

The code would become something like this:

    public func getData(at index0: Int, length: Int) -> Data? {
        let index = index0 - self.readerIndex
        guard index >= 0 && length >= 0 && index <= self.readableBytes - length else {
            return nil
        }

        // Largest payload that Data can store inline, without a heap allocation.
        let inlinableCount: Int
#if arch(x86_64) || arch(arm64) || arch(s390x) || arch(powerpc64) || arch(powerpc64le)
        inlinableCount = 14
#elseif arch(i386) || arch(arm)
        inlinableCount = 6
#else
        inlinableCount = 0 // unknown architecture: assume nothing is inlinable
#endif

        if length <= inlinableCount {
            // Small enough to be stored inline: copy the bytes instead of
            // retaining the buffer's storage.
            return self.withUnsafeReadableBytes { ptr in
                Data(bytes: ptr.baseAddress!.advanced(by: index), count: length)
            }
        }

        // Otherwise reference the buffer's storage and release it when Data is done.
        return self.withUnsafeReadableBytesWithStorageManagement { ptr, storageRef in
            _ = storageRef.retain()
            return Data(bytesNoCopy: UnsafeMutableRawPointer(mutating: ptr.baseAddress!.advanced(by: index)),
                        count: Int(length),
                        deallocator: .custom { _, _ in storageRef.release() })
        }
    }

lukasa · March 31, 2019, 8:12am

I think @johannesweiss and I are entirely in favour of copying into Data in some cases: don’t mistake his comments for disagreement. I think what he’s saying is that NIO should have more aggressive heuristics than you’re proposing. For example, copying up to 1024 bytes may still be cheaper overall than this behaviour. And if you have a ByteBuffer holding 100MB, copying out 1MB may be worth it if that gives the opportunity to free the underlying buffer.

In practice, the case you suggested and a general heuristic about when copying is cheap can probably be implemented by NIO. The other case (accidentally keeping the larger buffer alive) is something that users will probably have to work around, but we can definitely work to add diagnostics for when this is happening.

johannesweiss · March 31, 2019, 2:50pm

100% what @lukasa says. One option we also have is to implement heuristics with an explicit override for the user if they know what they're doing. Something like

extension ByteBuffer {
    public enum ByteTransferStrategy {
        case heuristic
        case copy // maybe `case forceCopy`
        case noCopy // maybe `case forceNoCopy`
    }

    public func getData(at index: Int, length: Int, transferStrategy: ByteTransferStrategy = .heuristic) -> Data? {
        [...]
    }
}

does that sound like a plan?

Tof · April 1, 2019, 7:11am

Done with SR-10252

johannesweiss · April 1, 2019, 10:12am

Thanks @Tof. Just checked out that issue. Would you mind retrying with the Swift 5 compiler?

Why would Swift 5 help? I believe the reason that you're seeing this issue is that the Swift compiler and the clang on your system have different address sanitizer versions and they're incompatible. It was a longstanding bug in Swift that it relied on the system's clang, so stuff (like ASan) relying on a particular clang version would only work if your system happened to have the right one. Swift 5 now finally fixed that by shipping a matching clang compiler.

EDIT: I just tried the example project from the bug report and it does compile fine with the Swift 5 compiler & ASan.