How do you deal with mixed integer types?

AFAIK it's good Swift practice to use Int whenever possible.
Sometimes that doesn't make sense though.

For example Data uses Int as count and FileHandle uses UInt64 to seek an offset.
A negative count or position doesn't make sense but neither does Int.max in practice. Both are valid (or historic?) choices.

In my case I am storing page numbers, offsets, sizes, checksums. UInt32, UInt16 and UInt8 will do and anything bigger is just writing a bunch of UInt8.zero bytes to disk. No point in wasting so much of available storage space on .zero. In the aggregate though content can be bigger than the UInt16.max of an individual page.

In other words, I'm mixing all these types to compute something. The user is presented with a uniform facade of Int. I have a hunch I am not the only one in this situation.

My question is: how do you deal with this in a nice way without giving up too much of Swift's type safety, and without cussing it out? :wink:

I have thus far tried variations like A:

    func space() throws -> Int
    {
        try Int(sheet.move(to: format_free).read(type: Page.Offset.self))
    }

    func set(space: Int) throws
    {
        assert(space < Page.Offset.max)
        try sheet.move(to: format_free).write(value: Page.Offset(space))
    }   

B:

    func space() throws -> Int
    {
        try sheet.move(to: format_free).read(type: Page.Offset.self, as: Int.self)
    }

    func set(space: Int) throws
    {
        try sheet.move(to: format_free).write(value: space, as: Page.Offset.self)
    }

C:

    func space() throws -> Page.Offset
    {
        try sheet.move(to: format_free).read(type: Page.Offset.self)
    }

    func set(space: Page.Offset) throws
    {
        try sheet.move(to: format_free).write(value: space)
    }

which lets the call site deal with it. That last one litters every call site with:

    let space = try Int(current.space())

or leads to ugly looking code like

    let space = try current.space()
    let foo = Int( UInt64(fileSize) / UInt64(pageSize))
    assert(foo < UInt32.max)
    let bar = foo * Int(number) + Int(...)
    example.seek( UInt64(bar) ).write( UInt16(referenceBar))

and it's way too easy to mess up types.

It's driving me bonkers and I can't seem to settle on something that looks nice, is easy to read and is type safe. Any suggestions or places I can look to for inspiration?

Please keep in mind that, in my case, the user can (maliciously) change the file contents including stored page offsets etc. The assumption that "writing valid offsets == reading valid offsets" is wrong even though the user never directly deals in offsets.

First off, Swift.numericCast(_:) is a good approach for conversion that is known to be safe. It basically tells the type inference system to deal with it.
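A minimal sketch of how numericCast(_:) reads at a call site. The variable names are made up for illustration; the destination type comes from the annotation, and the call traps at runtime if the value is not representable there.

```swift
// Hypothetical values standing in for a stored page number and a buffer.
let pageNumber: UInt32 = 42
let bytes: [UInt8] = [10, 20, 30]

// The target type is inferred from the annotation on the left.
let index: Int = numericCast(pageNumber)

// Narrowing works too, but traps if bytes.count exceeded UInt16.max.
let count: UInt16 = numericCast(bytes.count)
```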

Second, you should definitely use Int for numbers without any known bounds. I personally prefer to use UInt for numbers that are known to be non-negative, then the various explicitly-sized types only when I know they’re sufficient. That second part is very rare. It really seems like you should be using UInt for most of your example.

I don’t really see the problem with putting value-preserving initializers everywhere if you need them.

Oh, and one more tip: if you know that an operation cannot overflow (that is, the code can never reach the operation with values that would overflow), make sure to use the wrapping operators.
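As a sketch of that tip, here is a midpoint calculation where the precondition already rules out overflow, so the wrapping operators &+ and &- skip a check that could never fire. The function name and the non-negative restriction are assumptions for the example.

```swift
// Midpoint of a non-negative range. With 0 <= low <= high, the difference
// high - low fits in Int and the result never exceeds high, so neither
// operation can overflow and the wrapping forms are safe.
func midpoint(_ low: Int, _ high: Int) -> Int {
    precondition(low >= 0 && low <= high, "expected 0 <= low <= high")
    return low &+ (high &- low) / 2
}

// For contrast: when overflow does happen, wrapping operators wrap silently.
let wrapped = UInt8.max &+ 1   // 0
```

Note that the wrapping operators never trap, so they are only appropriate when the bounds have been established some other way.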

Unless you are forced to do otherwise for the sake of protocol conformance, you should expose the actual type you end up with. If you want to hide that for some reason, I suppose you could return an opaque type. I’d advise against it.


If you have to serialize numbers, then I agree you shouldn’t use Int or UInt: they vary between architectures. That being said, you should never convert numbers just because it looks simpler: it hides potential overflow issues and other problems. Don’t assume Int is Int32 or Int64 either: it can technically be anything.
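To illustrate the serialization point, here is a rough sketch that writes a fixed-width integer in a fixed byte order. Little-endian is an arbitrary choice for the example, and encode and decode are hypothetical helpers, not anything from the original code.

```swift
// Encode any fixed-width integer as little-endian bytes, so the on-disk
// layout does not depend on the host architecture.
func encode<T: FixedWidthInteger>(_ value: T) -> [UInt8] {
    withUnsafeBytes(of: value.littleEndian) { Array($0) }
}

// Decode little-endian bytes; returns nil if the byte count does not
// match the width of the requested type.
func decode<T: FixedWidthInteger>(_ type: T.Type, from bytes: [UInt8]) -> T? {
    guard bytes.count == MemoryLayout<T>.size else { return nil }
    var result: T = 0
    for (index, byte) in bytes.enumerated() {
        // Byte 0 is the least significant byte.
        result |= T(truncatingIfNeeded: byte) << (8 * index)
    }
    return result
}
```

With this, encode(UInt16(0x1234)) yields [0x34, 0x12] on any platform, whereas an Int would occupy a different number of bytes depending on the architecture.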

By the way, Swift’s convention is to prefix anything that might produce undefined behavior with “unsafe”. Everything else is safe, even if only by virtue of a runtime crash. Your own code should follow that convention too, though you should avoid both if possible.

The reason this is relevant is that conversion between different integer types is by and large always safe: you don’t need to perform your own assertions, it’ll do that anyway. Same with overflow and integer operations: either it is guaranteed to check for it and respond by crashing (the “default”), or it is guaranteed to skip the check and wrap (the wrapping operators, which are therefore marginally faster).
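For reference, a small sketch of the standard library conversions that never trap, alongside the overflow-reporting arithmetic. The value 70_000 is just an example that does not fit in UInt16; the plain UInt16(big) initializer would trap on it.

```swift
let big = 70_000

// Failable: nil when the value is not representable.
let exact = UInt16(exactly: big)

// Clamping: saturates at the nearest bound.
let clamped = UInt16(clamping: big)

// Truncating: keeps the low 16 bits (70_000 - 65_536 = 4_464).
let truncated = UInt16(truncatingIfNeeded: big)

// Arithmetic that reports overflow instead of trapping.
let (sum, overflowed) = UInt16.max.addingReportingOverflow(1)
```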

That being said, don’t write any code that actually expects to fail an assertion or precondition. That’s a recipe for disaster, and also causes it to break if you want to disable the checks when compiling for extra efficiency.

Thank you for your replies.

Yes, these numbers are serialised. Up to Int.max bytes (UInt8) are spread over up to UInt32.max pages, each with a fixed size of at most UInt16.max bytes. I guess this is one of those rare cases where the types are known to be sufficient.

Any two bytes form a valid UInt16, but the value might still point past the end of the page if the page size is only 4 KB. No programming error is required for that to happen. A runtime crash due to a corrupt file is not really nice behaviour.

The earlier example was maybe not well chosen. Another point of friction, for example, is when the store is in-memory only. A [Page] array has Int count and indexing, yet page numbers are UInt32.
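One way to contain that friction is to hide the Int indexing behind a wrapper that speaks page numbers. This is only a sketch: Page and InMemoryStore here are stand-ins invented for the example, not the types from the actual project.

```swift
// Stand-in page type for the example.
struct Page {
    var bytes: [UInt8]
}

// Wraps [Page] so that callers only ever see UInt32 page numbers.
struct InMemoryStore {
    typealias PageNumber = UInt32

    private var pages: [Page] = []

    // Returns the number of the new page, or nil once the UInt32
    // page-number space is exhausted.
    @discardableResult
    mutating func append(_ page: Page) -> PageNumber? {
        guard let number = PageNumber(exactly: pages.count) else { return nil }
        pages.append(page)
        return number
    }

    // Unknown page numbers yield nil instead of an out-of-bounds trap.
    subscript(number: PageNumber) -> Page? {
        guard let index = Int(exactly: number), pages.indices.contains(index) else {
            return nil
        }
        return pages[index]
    }
}
```

The conversions still exist, but they live in one place instead of at every call site.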

My best path forward then is to use the smaller Int types, upscale them liberally to Int but check and throw when downscaling to smaller types.
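That plan might look roughly like the sketch below. narrow(_:to:) and OutOfRange are hypothetical names; the only standard library piece relied on is the failable init(exactly:).

```swift
// Error thrown when a value does not fit the requested type.
struct OutOfRange: Error {
    let value: String
    let target: String
}

// Checked downscaling: throws instead of trapping.
func narrow<Source: BinaryInteger, Target: BinaryInteger>(
    _ value: Source,
    to target: Target.Type = Target.self
) throws -> Target {
    guard let result = Target(exactly: value) else {
        throw OutOfRange(value: "\(value)", target: "\(Target.self)")
    }
    return result
}

// Upscaling UInt16 to Int always succeeds, so the plain initializer is fine.
let stored: UInt16 = 4_000
let widened = Int(stored)

// Downscaling goes through the checked helper.
let narrowed = try? narrow(widened + 96, to: UInt16.self)
```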

Maybe a better conclusion is that mixing integer types is messy, better get used to it.

Edit:

“That being said, don’t write any code that actually expects to fail an assertion or precondition.”

No, of course not. I’m differentiating between my programming errors (assert), other programmers (precondition) and flawed input (throw).
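A sketch of that three-way split, using invented names (PageLayout, PageError) and assuming a stored offset is only valid when it is strictly less than the page size:

```swift
enum PageError: Error, Equatable {
    case corruptOffset(UInt16)
}

struct PageLayout {
    let pageSize: Int

    init(pageSize: Int) {
        // Misuse by another programmer calling this API: precondition.
        precondition((1...Int(UInt16.max)).contains(pageSize),
                     "page size must fit in UInt16")
        self.pageSize = pageSize
    }

    // Validates an offset that was read back from disk.
    func validated(storedOffset raw: UInt16) throws -> Int {
        // My own invariant, established in init: assert.
        assert(pageSize > 0)

        // Flawed or malicious input: throw, never crash.
        let offset = Int(raw)
        guard offset < pageSize else {
            throw PageError.corruptOffset(raw)
        }
        return offset
    }
}
```

With a 4 KB page, a stored offset of 5000 is a perfectly valid UInt16 yet is rejected with a thrown error rather than a trap.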

If throwing an error isn’t appropriate, there’s always Swift.fatalError(_:file:line:).

Is it possible for you to just never downscale? Note that Int and UInt are meant to be used as the “default” options. It’s literally impossible to rule out overflow for a fixed-size integer and any input, so the best option is to use them and assume (almost always correctly) it won’t be an issue. Worst-case scenario, the program safely crashes.

You’re sort of screwed, really. I personally hate the decision to use Int for count, and I’d rather it was changed in Swift 6. That seems unlikely, though.

You have three options:

  1. If you aren’t actually planning to run this on architectures less than 64-bit, you could just leave it. The compiler will ensure nothing dangerous happens as a result, and it will be extremely easy to identify the issue later. I wouldn’t worry about it: people usually don’t write code that will run on 8-bit architectures either.
  2. Create a structure (or tuple if it’s temporary) that holds two Arrays, and call them high and low. The standard library uses this technique for some operations. You can implement Sequence on it without a hitch (consider exposing the truncated cardinality with underestimatedCount, which defaults to the technically correct 0), then perform many operations that way.
  3. Drop down to pointers and manual memory management. I think this is probably overkill, but Swift does work with pointers really well even though you almost never need it.

No, because they have to fit in the allocated storage space. Otherwise finding anything within the file becomes difficult or slow.

Yeah, I think so too. :joy:

  1. 64 bit is the assumption.
  2. Interesting
  3. Yeah, was hoping to avoid that for the in-memory store at least.

Ok, I’ll review my code and see how it goes. Thanks for the discussion.

Don’t worry about it too much: watch this session and take it slow. You can still preserve a considerable amount of safety so long as you pay attention to the documentation.

What's your goal for this assert? Page.Offset(space) will fatalError if space is not representable, so the assert isn't adding any safety.

That’s why I went over the safety guarantees: the check is completely redundant. Unless you are using something unsafe, you can trust that the checks are already implemented inside the function.

For obvious reasons, you should be implementing your own assertions and preconditions that way too.

In that case, why did you upscale originally?

One page contains at most UInt16.max bytes, or roughly 64 KB. That’s not a lot, so content like an image is stored across multiple pages. Pages reference other pages through their page number (UInt32). There is a lot of arithmetic going on with these integer types.
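To make the arithmetic concrete, here is a simplified sketch that widens before computing and narrows only at the end. It assumes every byte of a page is payload (no page header), and both function names are made up for the example.

```swift
// Number of pages needed to hold byteCount bytes, or nil if the input is
// invalid or the result does not fit in a UInt32 page number.
func pagesNeeded(forByteCount byteCount: Int, pageSize: UInt16) -> UInt32? {
    guard byteCount >= 0, pageSize > 0 else { return nil }
    let size = Int(pageSize)
    // Round up without computing byteCount + size - 1, which could overflow.
    let pages = byteCount / size + (byteCount % size == 0 ? 0 : 1)
    return UInt32(exactly: pages)
}

// Byte position of a page in the file. UInt32.max * UInt16.max is below
// 2^48, so the product always fits in UInt64.
func fileOffset(ofPage number: UInt32, pageSize: UInt16) -> UInt64 {
    UInt64(number) * UInt64(pageSize)
}
```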

Hmm, the snippets are work in progress and I got frustrated again with all the type mixing. So I thought to ask here. Sorry for the less than ideal examples. The underlying problem is valid though.

…What exactly are you trying to accomplish here? The only reason to not let the environment’s virtual memory handle this is if you are writing a virtual memory system. And where does arithmetic come into it?

A database written from scratch in pure swift using async/await. Currently working on saving a B-Tree to disk. Well actually working on the Write-Ahead Log to deal with power failures.

It’s a very instructive project. Yes, mature DBs exist. Yes, I’m reinventing the wheel (don’t care). No, I don’t expect anybody to use mine. Who knows in 20 years, though. :joy: In the meantime I hope to learn and improve my skills through this project. Most likely outcome: this project dies silently when I stop learning from it. And that’s ok too.

Go nuts, there’s always room for another option.

That being said, I recommend abstracting things a bit more if you can. Make distinct modules, maybe use existing packages for lower-level things. In particular, you may want to take a look at core libraries like Atomics. Many of them are meant to be treated like part of the standard library, and they’re very performant.

If you really want to do it with the standard library alone, at least consider following a similar separation of concerns.
