To/from memory-mapped file?

On Apple platforms you have to be very careful with memory mapping. If the targeted file is stored on a volume that can ‘go away’ — for example, on an external drive or a network volume — then memory mapping isn’t safe because accessing a file that’s gone away triggers a machine exception [1], which isn’t something you can reasonably handle [2].

Still, if you want to go down this path there are two standard APIs for mapping a file:

  • mmap — See the mmap man page.

  • Data.init(contentsOf:options:) — Use either the .mappedIfSafe or the .alwaysMapped option.
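
As a minimal sketch of the Foundation route, assuming the file lives on a volume that can't go away: loadMapped is a hypothetical helper name, not part of any API.

```swift
import Foundation

/// Hypothetical helper: loads a file, hinting that Foundation should
/// memory-map it when it considers that safe.
func loadMapped(_ url: URL) throws -> Data {
    // .mappedIfSafe is only a hint; Foundation may fall back to an
    // ordinary read. .alwaysMapped asks for a mapping where possible.
    try Data(contentsOf: url, options: .mappedIfSafe)
}
```

The caller gets a plain Data value either way, so nothing downstream needs to know whether the bytes were actually mapped.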

You go the other way using the file system API of your choice: write, fwrite, the System framework, FileHandle, and so on.
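
For example, here's a rough sketch of the FileHandle option. The overwrite function is a hypothetical helper, and it assumes a deployment target recent enough to have the throwing FileHandle methods.

```swift
import Foundation

/// Hypothetical helper: writes `bytes` at `offset` in an existing file.
func overwrite(_ bytes: Data, at offset: UInt64, in url: URL) throws {
    let handle = try FileHandle(forWritingTo: url)
    // Close the handle on every exit path, including thrown errors.
    defer { try? handle.close() }
    try handle.seek(toOffset: offset)
    try handle.write(contentsOf: bytes)
}
```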


IMO memory mapping is overused. I see two primary motivations here:

  • Unix Lore™ is that memory mapping is the only way to do no-copy reads and writes. That’s never been true on Apple platforms, and hasn’t been true on most Unix-y platforms for decades.

  • Folks think it’ll be convenient because they can define a language structure that maps to their data structure and then they’re just reading and writing fields rather than doing I/O. That rarely works out as well as you might hope [3].

And memory mapping has a lot of drawbacks:

  • The big one is the safety issue I’ve discussed above.

  • If the file is large, you have to worry about address space issues on iOS.

  • And on 32-bit platforms [4].

  • If you’re streaming through a large file you end up running all your I/O through the buffer cache, which is much less efficient than doing no-copy reads and writes.

  • If you’re accessing the file at random, those accesses put pressure on the VM system, which causes other pages to get evicted. That might result in your I/O being fast but the rest of the system suffering.
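
If the goal is just to stream through a large file, one alternative to mapping is to read it in fixed-size chunks, which keeps address space usage bounded. This is only a sketch: forEachChunk and its 1 MiB default are illustrative choices, and a plain FileHandle read like this is not, by itself, a no-copy read.

```swift
import Foundation

/// Hypothetical helper: streams a file in fixed-size chunks,
/// passing each chunk to `body` in order.
func forEachChunk(
    of url: URL,
    chunkSize: Int = 1 << 20,
    _ body: (Data) throws -> Void
) throws {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }
    // Stop on nil or on an empty chunk, either of which signals end of file.
    while let chunk = try handle.read(upToCount: chunkSize), !chunk.isEmpty {
        try body(chunk)
    }
}
```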

Which isn’t to say that this approach is always wrong, just that folks often try to use it when it’s not appropriate.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

[1] Using the terminology defined here.

[2] I talk about that problem a lot in Implementing Your Own Crash Reporter.

[3] This technique has serious problems:

  • It only works if you defined the language structure in C, because Swift doesn’t give you a way to define the layout of its structures.

  • Even then, C’s structure layout is more of a shared hallucination than something defined by the standard.

  • And you still have to deal with alignment issues.

  • And byte ordering.
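
One way to sidestep the problems listed above is to decode each field explicitly from the bytes rather than overlaying a structure on them. Here's a minimal sketch, assuming a made-up format that stores a little-endian UInt32 at a known offset. readUInt32LE is a hypothetical helper.

```swift
import Foundation

/// Hypothetical helper: reads a little-endian UInt32 at `offset`,
/// or returns nil if fewer than four bytes are available there.
func readUInt32LE(_ data: Data, at offset: Int) -> UInt32? {
    guard offset >= 0, data.count - offset >= 4 else { return nil }
    var value: UInt32 = 0
    for i in 0..<4 {
        // Index relative to startIndex so this also works on Data slices.
        let byte = data[data.startIndex + offset + i]
        // Assemble byte by byte: no alignment requirement, and the
        // result doesn't depend on the host's byte order.
        value |= UInt32(byte) << UInt32(8 * i)
    }
    return value
}
```

It's more typing than a structure overlay, but the layout is spelled out in code rather than left to the compiler.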

[4] In the Apple ecosystem there is one remaining 32-bit platform, watchOS.
