Compile-time static string length validation similar to InlineArray?

I use Swift with network protocols where string or array lengths can be limited.

Swift 6.2 introduces InlineArray, which is great for enforcing compile-time checks on element count:

extension Message {
    mutating func setFixedSizeByteArray(_ value: InlineArray<4, UInt8>) {
        // ...
    }
}

message.setFixedSizeByteArray([0x01, 0x02, 0x03, 0x04])  // OK
// message.setFixedSizeByteArray([0x01, 0x02, 0x03, 0x04, 0x05]) // error: Expected '4' elements in inline array literal, but got '5'

I'd love to have something similar for strings, e.g.:

func setFixedSizeAccount(_ value: InlineString<4>) {
}

message.setFixedSizeAccount("blindspot") // compile-time error: 9 chars instead of 4

I know that I could implement a custom macro that checks the length and emits diagnostics.

But is there a standard or recommended way to achieve something like this in Swift 6.2?

One of the easiest ways to do this would be:

typealias InlineUTF8String<let count: Int> = InlineArray<count, UInt8>

extension InlineArray: ExpressibleByStringLiteral, CustomStringConvertible where Element == UInt8 { ... }

[1][2]

  1. Altered from Character to UInt8 per @allevato
  2. Removed StringProtocol conformance per @grynspan

If we want to talk about compile-time static string lengths, that conversation must include what encoding you actually want. For example, an "inline string" of UTF-8 or ASCII code units would make a lot of sense for some low-level use cases.

A compile-time static string of Characters, on the other hand, is sort of nonsensical. Characters themselves can have varying sizes, and even the same Character (where "same" is defined in terms of Unicode equivalence) can be a different size depending on which normalization method you use. (Is "é" two or three bytes? It depends.)
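To make the "é" point concrete, here is a small sketch using only the standard library. The variable names are just for illustration. The two strings compare equal and each contains one Character, yet their UTF-8 encodings differ in length:

```swift
// "é" written two canonically equivalent ways.
let precomposed = "\u{E9}"     // single scalar U+00E9
let decomposed = "e\u{301}"    // "e" followed by U+0301 COMBINING ACUTE ACCENT

// Both are one Character, and String equality uses canonical equivalence.
let sameCharacterCount = precomposed.count == decomposed.count
let comparesEqual = precomposed == decomposed

// The UTF-8 byte counts differ, so "length" depends on the chosen view.
let precomposedBytes = precomposed.utf8.count
let decomposedBytes = decomposed.utf8.count

print(sameCharacterCount, comparesEqual, precomposedBytes, decomposedBytes)
```

A fixed-size field on the wire is almost always counted in code units, which is why the encoding has to be part of the type's definition.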

Heck, even the length of a String in terms of Characters is a property of the Unicode tables used at runtime, so if the compiler did hypothetically try to verify some notion of "length in Characters" at compile time, there's no guarantee it would still hold if you run that binary on a platform with a different version of Unicode.

In our case it would be UTF-8 or ASCII, depending on the target destination we're working with.

I wouldn't do this. You'll probably have a bad day adding these (this many!) retroactive conformances, and adding StringProtocol in particular makes InlineArray conform to Collection, which is a bad idea, as discussed when InlineArray was pitched.

Edit: I mean, you shouldn't add any of these conformances. Retroactive conformances are not your friends. :frowning:

Not as simple as passing "hello", but you could probably use a macro to generate that fixed-size inline array literal from a string.

To answer your question directly: no.

Thank you everyone for the answers!

Right, I only have ASCII/UTF-8 encodings and was thinking only about them. I think that is, and will remain, the only need for my use cases.

I was looking at this as one possible approach for a custom struct. However, even if I implement ExpressibleByStringLiteral for InlineArray or any custom struct or class, I couldn't find a better way than failing at runtime:

struct InlineString<let count: Int>: ExpressibleByStringLiteral {
    init(stringLiteral value: StringLiteralType) {
        precondition(value.count <= count, "Expected at most \(count) characters in inline string literal, but got \(value.count)")
        // ...
    }
}
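For reference, a fuller sketch of that runtime-checked wrapper might look like the following. It assumes a Swift 6.2 toolchain (and, on Apple platforms, a deployment target where InlineArray is available). The `storage` property and the zero-padding policy are assumptions for illustration, not something from the standard library. It also counts UTF-8 bytes rather than Characters, since the encoding is what the wire format cares about:

```swift
// Hypothetical wrapper type; the check still happens at runtime, not compile time.
struct InlineString<let count: Int>: ExpressibleByStringLiteral {
    // Fixed-size UTF-8 storage, zero-padded on the right (assumed policy).
    var storage: InlineArray<count, UInt8>

    init(stringLiteral value: String) {
        let bytes = Array(value.utf8)
        // Compare encoded bytes, not Characters, against the capacity.
        precondition(
            bytes.count <= count,
            "Expected at most \(count) UTF-8 bytes in inline string literal, but got \(bytes.count)"
        )
        var buffer = InlineArray<count, UInt8>(repeating: 0)
        for (index, byte) in bytes.enumerated() {
            buffer[index] = byte
        }
        storage = buffer
    }
}

let account: InlineString<4> = "abc"
```

Wrapping InlineArray in a new type also avoids adding retroactive conformances to InlineArray itself.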

At some point I wondered whether I could use a where constraint on count for a string literal, but that also seems to be available only at runtime. It would be fun if const values could be part of a where clause, like below:

protocol StringLiteralProtocol {
    associatedtype Element
    static var count: Int { get } // or `associatedValue let count: Int { get }`
}

struct InlineString<let count: Int>: ExpressibleByStringLiteral {
    init<StringLiteral: StringLiteralProtocol>(
        _ value: StringLiteral
    ) where StringLiteral.count <= Self.count,
            StringLiteral.Element: UTF8Char {
        // ...
    }
...

Using a direct array literal is one of the approaches that works for internal use cases and code generation, though it looks a bit scary when exposed as an end-user API.
So, yeah, a macro is probably the way to do it.

One approach was to make a freestanding macro that would infer the size:

@freestanding(expression)
public macro fixedString<let count: Int>(_ string: String, size: Int = count) -> InlineArray<count, UInt8> =
    #externalMacro(module: "PluginTypeMacros", type: "FixedStringMacro")

message.setFixedSizeByteArray(#fixedString("123")) // cannot get `size` parameter from expression

Unfortunately, I could extract neither the default argument nor the return type, which is a macro limitation as far as I understood from other threads.
Instead, an explicit size works:

message.setFixedSizeByteArray(#fixedString("123", size: 4)) // padded string with zeros
message.setFixedSizeByteArray(#fixedString("12345", size: 4)) // error -> "Expected '4' elements in inline array literal, but got '5'"
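The core of such a macro can be sketched without SwiftSyntax as a plain function that turns the string into the source text of a byte array literal. The function name `fixedStringLiteralSource` is hypothetical, and the behavior is a guess based on the two calls above: short strings are zero-padded up to `size`, and longer strings are emitted unpadded so that the compiler's own element-count diagnostic fires on the expansion.

```swift
// Hypothetical helper: computes the literal text a #fixedString expansion might produce.
func fixedStringLiteralSource(_ string: String, size: Int) -> String {
    let bytes = Array(string.utf8)
    // Pad with zeros only when the string is shorter than the requested size.
    let padding = max(0, size - bytes.count)
    let padded = bytes + Array(repeating: UInt8(0), count: padding)
    // Render each byte as a two-digit hex literal such as 0x0A.
    let elements = padded.map { byte -> String in
        let hex = String(byte, radix: 16, uppercase: true)
        return "0x" + (hex.count < 2 ? "0" + hex : hex)
    }
    return "[" + elements.joined(separator: ", ") + "]"
}

let fits = fixedStringLiteralSource("123", size: 4)
let tooLong = fixedStringLiteralSource("12345", size: 4)
print(fits)
print(tooLong)
```

A real implementation would live in a compiler plugin and could emit its own diagnostic for over-long input instead of relying on the InlineArray literal check.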

A macro is probably a good compromise for current use cases. However, if I could write generic constraints for custom structures instead, it would be much, much nicer.

So, I wonder whether any features are planned in either of these directions?