Interestingly, in -Ounchecked, the assembly for foo is quite a bit longer. I would have expected it to be the same sans the comparison and jump. Is there a missed optimization opportunity in -Ounchecked, or is this just my very basic understanding of assembly shining through?
Similarly, the -Osize version does not inline, which sounds right at first, but the resulting foo is actually longer than in -O, so inlining would bring code-size benefits too, iiuc.
This may be a newbie question, but would you please explain why you do the precondition this way?
Why do you add 7 first then divide the sum by 8? What is the problem with using (Self.bitWidth / 8) directly?
Is it valid and safe if I use bytes.count == MemoryLayout<Self>.size / MemoryLayout<UInt8>.size as precondition here?
Why do you add 7 first then divide the sum by 8? What is the problem with using (Self.bitWidth / 8) directly?
It's just me being extra-paranoid and making it work correctly for hypothetical integer types with bitWidth that is not a multiple of 8. No public standard library type would ever be in that boat, and it would be slightly weird for a custom type as well, but it is possible.
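To make the rounding concrete, here is a minimal sketch comparing the two formulas. The helper names `roundedUpByteCount` and `truncatedByteCount` are made up for illustration and are not part of the code under discussion; the 12-bit width stands in for a hypothetical custom integer type.

```swift
// Hypothetical helpers, for illustration only.

// Rounds up: any leftover bits still get a full byte.
func roundedUpByteCount(bitWidth: Int) -> Int {
    return (bitWidth + 7) / 8
}

// Truncates: leftover bits are silently dropped.
func truncatedByteCount(bitWidth: Int) -> Int {
    return bitWidth / 8
}

// For standard library types, bitWidth is a multiple of 8, so both agree.
print(roundedUpByteCount(bitWidth: UInt32.bitWidth))  // 4
print(truncatedByteCount(bitWidth: UInt32.bitWidth))  // 4

// For a hypothetical 12-bit integer type they differ.
print(roundedUpByteCount(bitWidth: 12))  // 2
print(truncatedByteCount(bitWidth: 12))  // 1
```

With plain division, a 12-bit type would be asked for only one byte, which cannot hold all of its bits.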
Is it valid and safe if I use bytes.count == MemoryLayout<Self>.size / MemoryLayout<UInt8>.size as precondition here?
I don't think that there's a guarantee that a type conforming to FixedWidthInteger doesn't have some additional fields other than its numerical value, which would make this do the wrong thing. This would be extremely odd, however.
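As a rough sketch of the kind of oddball type I mean: `TaggedUInt16` below is a made-up struct, and the FixedWidthInteger conformance is left out to keep it short. Assume it would report a bitWidth of 16.

```swift
// Hypothetical type: a 16-bit value plus an unrelated extra field.
// Imagine it also conformed to FixedWidthInteger with bitWidth == 16
// (conformance omitted here for brevity).
struct TaggedUInt16 {
    var value: UInt16
    var tag: Bool
}

let bitWidth = 16  // what the assumed conformance would report
let bytesFromBitWidth = (bitWidth + 7) / 8
let bytesFromLayout = MemoryLayout<TaggedUInt16>.size / MemoryLayout<UInt8>.size

print(bytesFromBitWidth)
print(bytesFromLayout)
```

The layout-based count includes the extra field, so it comes out larger than the two bytes the numerical value needs, and a precondition built on it would reject a correctly sized byte sequence.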