Memory leaking in Vapor app

So first of all, it's very important that we're not blaming anybody. Every participant here was very reasonable, and we don't know the final outcome yet. And thanks very much to the OP for posting such a succinct reproduction; that was/is invaluable.

tl;dr

It's (probably) not a leak but there's too much memory allocated. I say probably because I could not see a leak with the demo program from the post but I can't say for sure that the OP's real production application doesn't have a leak.

long story

Okay, I now had a chance to look into this and at least the reproduction that the OP posted does not leak for me. BUT it's very important to say that for the longest time I did think that it leaked. Let me go through why.

The first thing I did was run this app under valgrind, then run Scripts/run_test.rb; after the script finished, I pressed Ctrl+C and valgrind presented me with:

==19319== 2,478,760 bytes in 9,995 blocks are definitely lost in loss record 1,174 of 1,177
==19319==    at 0x4C2DE56: malloc (vg_replace_malloc.c:299)
==19319==    by 0x5202D51: swift_slowAlloc (in /usr/lib/swift/linux/libswiftCore.so)
==19319==    by 0x5202DBE: _swift_allocObject_(swift::TargetHeapMetadata<swift::InProcess> const*, unsigned long, unsigned long) (in /usr/lib/swift/linux/libswiftCore.so)
==19319==    by 0x50ED6FC: Swift._StringGuts.reserveUnusedCapacity(_: Swift.Int, ascii: Swift.Bool) -> () (in /usr/lib/swift/linux/libswiftCore.so)
==19319==    by 0x5D57092: generic specialization <[Swift.Character]> of Swift.String.append<A where A: Swift.Sequence, A.Element == Swift.Character>(contentsOf: A) -> () (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5D54ECE: Foundation.NSData.base64EncodedString(options: Foundation.NSData.Base64EncodingOptions) -> Swift.String (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5FF0FC5: partial apply forwarder for closure #1 (Foundation.NSData) -> Swift.String in Foundation.Data.base64EncodedString(options: Foundation.NSData.Base64EncodingOptions) -> Swift.String (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5FEDB04: function signature specialization <Arg[0] = Exploded> of function signature specialization <Arg[1] = [Closure Propagated : reabstraction thunk helper from @callee_guaranteed (@guaranteed Foundation.NSData) -> (@unowned Foundation._NSRange, @error @owned Swift.Error) to @escaping @callee_guaranteed (@guaranteed Foundation.NSData) -> (@out Foundation._NSRange, @error @owned Swift.Error), Argument Types : [@callee_guaranteed (@guaranteed Foundation.NSData) -> (@unowned Foundation._NSRange, @error @owned Swift.Error)]> of generic specialization <Foundation._NSRange> of Foundation._DataStorage.withInteriorPointerReference<A>(Swift.Range<Swift.Int>, (Foundation.NSData) throws -> A) throws -> A (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x5FDF1A9: Foundation.Data.base64EncodedString(options: Foundation.NSData.Base64EncodingOptions) -> Swift.String (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x606CE37: Foundation.(_JSONEncoder in _12768CA107A31EF2DCE034FD75B541C9).box_<A where A: Swift.Encodable>(A) throws -> Foundation.NSObject? (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x6072B17: Foundation.(_JSONEncoder in _12768CA107A31EF2DCE034FD75B541C9).encode<A where A: Swift.Encodable>(A) throws -> () (in /usr/lib/swift/linux/libFoundation.so)
==19319==    by 0x607367F: protocol witness for Swift.SingleValueEncodingContainer.encode<A where A1: Swift.Encodable>(A1) throws -> () in conformance Foundation.(_JSONEncoder in _12768CA107A31EF2DCE034FD75B541C9) : Swift.SingleValueEncodingContainer in Foundation (in /usr/lib/swift/linux/libFoundation.so)

So it says a couple of Strings have been definitely lost. I believed that and dug a bit further. Also, almost all BinaryData objects appeared to have leaked.

A little later, I noticed that valgrind was still running at 100% CPU for almost a minute after Scripts/run_test.rb stopped. When I waited that minute until valgrind was back at 0% CPU and then pressed Ctrl+C, it actually reported no leaks anymore!

So why is that? In the original post, you can see an incredible sawtooth pattern: the app consumes a massive amount of memory, then goes back to idle. valgrind makes execution so slow that when I pressed Ctrl+C, I was still near the top of the sawtooth. In a C program, valgrind would have reported that memory as 'still reachable', but with Swift it's possible that valgrind reports 'definitely lost' when in reality that's not true. Why? Because the Swift compiler is very clever and uses the spare bits of pointers to store things like enum tags. So when valgrind searches memory for, say, the pointer value 0xdeadbe00, it won't find it, because through the clever packing of enum tags into the spare bits, the pointer might be stored in memory as 0xdeadbeef (here the extra 0xef could be an enum tag).

It's also interesting that it shows a String as leaked, because String makes a lot of use of clever bit-packing, which is pretty awesome as it gives us great performance. Unfortunately, it also confuses leak checkers that just scan memory for values that would form a pointer to a given allocation.
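A tiny illustration of that bit-packing (a sketch; the exact layouts are implementation details of the Swift runtime):

```swift
// Swift uses "extra inhabitants" / spare bits of pointers to encode enum
// tags, so Optional<AnyObject> needs no extra tag byte beyond the pointer:
assert(MemoryLayout<AnyObject>.size == MemoryLayout<Optional<AnyObject>>.size)

// String packs small strings entirely inline and tags its storage in spare
// bits, so the bits in memory often aren't a plain pointer to the allocation:
print(MemoryLayout<String>.size)  // 16 on 64-bit platforms
```

A leak checker that scans memory for the raw allocation address can miss both of these representations.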

Right, I confirmed a couple of times that when I waited long enough, valgrind always reported 'no leaks'.

Okay, so I ditched valgrind and looked at the raw memory stats. This is from the very beginning of the program:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      3412  0.1  0.6 472092 26356 pts/0    Sl+  17:56   0:00 .build/x86_64-unknown-linux/release/Run

So we see about 26MB resident size. Let's run Scripts/run_test.rb and check ps again:

root      3412  2.4 10.1 865308 408780 pts/0   Sl   17:56   0:02 .build/x86_64-unknown-linux/release/Run

BAD! Now we're at 408MB resident. Again, this looks like a massive leak! Let's run Scripts/run_test.rb again and check `ps`:

root      3412  3.2 11.7 865308 473468 pts/0   Sl   17:56   0:04 .build/x86_64-unknown-linux/release/Run

Now we're seeing 473MB. Wow, even more 'leaks'! But oddly, it's less pronounced now... One more time:

root      3412  3.8 11.9 865308 481064 pts/0   Sl   17:56   0:06 .build/x86_64-unknown-linux/release/Run

481MB, ha, the 'leaks' are getting smaller. Because memory fragmentation is a thing, I decided to use mallinfo to let the Linux allocator tell me how many bytes are actually allocated, rather than how much memory the application has requested from the kernel.

So, back to square one: let's print mallinfo's numbers every 10 seconds. At the start of the program it prints

mcheck_malloc_stats(arena: 1482752, ordblks: 524, smblks: 85, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 4928, uordblks: 914544, fordblks: 568208, keepcost: 263888)

We can ignore most of these cryptic numbers, but uordblks: 914544 tells us the number of bytes actually in use by the allocator. That's just under 1MB. Let's run Scripts/run_test.rb:

After one iteration it prints

mcheck_malloc_stats(arena: 470839296, ordblks: 627, smblks: 119, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 7264, uordblks: 1027664, fordblks: 469811632, keepcost: 263888)

so uordblks: 1027664, that's just over 1MB. So we did grow in memory, but only veeeery slightly. Let's run the script again:

mcheck_malloc_stats(arena: 471547904, ordblks: 629, smblks: 119, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 7184, uordblks: 1027728, fordblks: 470520176, keepcost: 263888)

ha, this time we grew by only 64 bytes: uordblks: 1027728. Let's do it another 10 times:

mcheck_malloc_stats(arena: 471572480, ordblks: 629, smblks: 119, hblks: 0, hblkhd: 0, usmblks: 0, fsmblks: 7168, uordblks: 1027728, fordblks: 470544752, keepcost: 263888)

and again: not a single byte leaked.

So what's going on? Why does this process use 25MB of kernel memory at the beginning, grow to almost 400MB, and never shrink again, despite the fact that only 1MB of memory is actually allocated? It's memory fragmentation. The program the OP posted loads 10,000 objects into memory at the same time. Due to (I think) an inefficiency in Vapor's mysql driver, these 10,000 objects temporarily hold on to pretty big Data objects and ByteBuffers. They all need to fit in memory at the same time, so the program has to request almost 500MB from the kernel. Only a few seconds later, it releases all that memory again, but due to fragmentation pretty much every region we allocated from the kernel is still at least partially in use.

That's why we see the kernel memory growing slower and slower: we start with 25MB, then go to 400MB, then 450MB, then 470MB, then 480MB. At some point we've simply allocated enough kernel memory to fit the next 10,000 objects without asking for more.
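The scale of the fragmentation follows directly from the mallinfo numbers above:

```swift
import Foundation

// Numbers taken from the mallinfo output above:
let arena = 471_547_904       // bytes the allocator currently holds from the kernel
let uordblks = 1_027_728      // bytes actually in use by the application
let liveFraction = Double(uordblks) / Double(arena)
print(String(format: "%.2f%% of the arena is live", liveFraction * 100))  // ~0.22%
```

Well under 1% of the nearly 500MB the allocator holds is live data, but scattered live allocations keep the regions from being returned to the kernel.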

How can we fix it?

So the problem is that too much memory is needed at the same time. It's not leaked, but it's still bad because our heap grows pretty big. How can this be resolved? There are two options:

  1. the OP's program loads 10,000 objects into memory, which, depending on the size of the objects, can be a lot. So maybe that can be reduced?
  2. I believe (without proof; I didn't check properly) that the mysql driver allocates pretty big ByteBuffer objects, and I'm not sure they all need to be that big.

So why do I believe (without proof) that the mysql driver might allocate objects that are too big? Our maximum memory with 10,000 objects loaded is about 500MB. That makes (roughly) 50kB per object, and that seems way too large. I have no idea what's stored in the mysql database or what exactly these 10,000 objects represent, but it seems unlikely that they should each be 50kB :slight_smile:. I'm very happy to be wrong here; if each of them is some kind of image, or a large body of text, then that all makes sense, but it feels like a bit much.

How could one accidentally hold onto a ByteBuffer/Data that keeps lots of data in memory? It's actually quite simple. Imagine you're writing a database driver: you communicate with the database, and it sends you back lots of data, but maybe you're only interested in a very small portion. For the sake of argument, let's say the database responds with 1kB of data but you're only interested in 4 bytes of it. Now you might do

let data = byteBuffer.readData(length: 4)  // Data? in NIO; nil if not enough readable bytes

But the byteBuffer might contain the whole 1kB response. And because we try not to copy the bytes over, the returned Data object will reference the whole 1kB ByteBuffer, despite being only '4 bytes long'.
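To make the effect concrete, here's a minimal model of that sharing (hypothetical types, not NIO's actual ByteBuffer/Data implementation):

```swift
// A minimal model of the sharing problem. `Storage` stands in for a
// ByteBuffer's backing allocation; `SliceView` for a zero-copy Data.
final class Storage {
    let bytes: [UInt8]
    init(_ bytes: [UInt8]) { self.bytes = bytes }
}

struct SliceView {
    let storage: Storage          // retains the ENTIRE backing allocation
    let range: Range<Int>
    var copied: [UInt8] { Array(storage.bytes[range]) }
}

func demo() -> Bool {
    weak var backing: Storage?
    var copy: [UInt8] = []
    do {
        // We only want 4 bytes, but the view pins the whole 1 KiB allocation.
        let view = SliceView(storage: Storage(Array(repeating: 42, count: 1024)),
                             range: 0..<4)
        backing = view.storage
        precondition(backing != nil)  // the full 1 KiB is still alive
        copy = view.copied            // force a 4-byte copy
    }
    // Once the view is gone, the copy no longer pins the backing storage.
    return backing == nil && copy.count == 4
}
print(demo())  // prints "true"
```

With 10,000 such views alive at once, 10,000 full-size backing buffers have to fit in memory simultaneously, even though only a tiny fraction of each is wanted.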

Clearly something's wrong here. In NIO, if you want to force a copy, you can do

let data = Data(byteBuffer.viewBytes(at: byteBuffer.readerIndex, length: 4))

but I think this is way too hard. NIO's getData/readData should do a bit better. We can't be perfect here because it's impossible to decide for sure whether copying the bytes would be better or not, but NIO should definitely have better heuristics. If, for example, the request is for less than 1024 bytes or so, we should just copy. And if the ByteBuffer is massively larger than what we want out of it, we should also copy instead of referencing the whole ByteBuffer. I filed a bug for this.
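Such a heuristic could look something like this; the thresholds are made up for illustration and are not what NIO actually does:

```swift
// Illustrative-only heuristic: copy when the requested slice is small,
// or when the backing buffer is much larger than the slice, so a tiny
// Data can't pin a huge allocation. The thresholds are invented.
func shouldCopyInsteadOfShare(requestedBytes: Int, bufferCapacity: Int) -> Bool {
    if requestedBytes < 1024 { return true }                // small: copying is cheap
    if bufferCapacity >= requestedBytes * 4 { return true } // slice would pin 4x+ its size
    return false                                            // big slice of a big buffer: share
}
```

In the 4-bytes-of-1kB example above, both conditions fire and the bytes would be copied, so the big buffer could be freed immediately.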

To get the best possible result, NIO and Vapor's database drivers need to work together: if a copy is definitely better, NIO should probably do it automatically, but when Vapor's driver knows that a copy would be sensible, it should force that copy itself.

Sorry, I haven't proof-read this at all and I realise it's a massive wall of text. But I'm off for the weekend (without a computer), so I thought it might be more useful to send my findings now than not.

conclusion

Please verify that you can confirm my findings. You can just do

import Dispatch
import Glibc  // for sleep(); mallinfo() itself needs a small C module exposing <malloc.h>

DispatchQueue(label: "print").async {
    while true {
        let info = mallinfo()
        print(info)
        sleep(10)
    }
}

to print the information every 10 seconds. Note that mallinfo only works on Linux, and you need to create a small C module to expose it to Swift.
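For reference, a sketch of what that C module could look like in a SwiftPM package. The target name is hypothetical; SwiftPM generates the module map for a C target automatically from its headers, so a header that re-exports glibc's declaration is enough:

```c
/* Sources/CMallinfoShim/include/CMallinfoShim.h  (hypothetical target name)
 *
 * glibc's <malloc.h> declares `struct mallinfo` and `mallinfo()`.
 * Re-exporting it through a C target makes it available in Swift
 * via `import CMallinfoShim`.
 */
#include <malloc.h>
```

You'd then add a `.target(name: "CMallinfoShim")` to Package.swift and make the Swift target depend on it.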

edits/notes

  • when I write “kernel memory” I mean memory provided by the kernel. The user-space allocator can request a memory region from the kernel and it can also return memory to the kernel, but it can only return a whole region. So if it requested, say, 1MB and after a while is only using 10 bytes of that 1MB region, it cannot return the rest, and ps will show that the process still needs the whole 1MB
  • the reason this only reproduced on Linux was twofold: 1) you’d only use valgrind for leak checking on Linux 2) the usual macOS tools all report (I think) the actually allocated size and not what has been handed out by the kernel
  • need to check if I can get LSan to work with Swift properly