[Pitch] UUID v7, other improvements

Hi all,

I’ve prepared a proposal and implementation for some commonly requested improvements to UUID. Please find the complete proposal here, along with a draft implementation.


UUID Version Support and Other Enhancements

Introduction

Foundation's UUID type currently generates only version 4 (random) UUIDs. RFC 9562 defines several UUID versions, each suited to different use cases. This proposal adds support for creating version 7 (time-ordered) UUIDs, which have become widely adopted for database keys and distributed systems due to their monotonically increasing, sortable nature.

In addition, UUID is in need of a few more additions for modern usage, including support for lowercase strings, access to the bytes using Span, and accessors for the commonly used nil and max sentinel values.

Motivation

UUID version 4 (random) is a good general-purpose identifier, but its randomness makes it poorly suited as a database primary key — inserts into B-tree indexes are scattered across the keyspace, leading to poor cache locality and increased write amplification. UUID version 7 addresses this by encoding a Unix timestamp in the most significant 48 bits, producing UUIDs that are monotonically increasing over time while retaining sufficient randomness for uniqueness.

Today, developers who need time-ordered UUIDs usually construct the bytes manually using UUID(uuid:), which is error-prone and requires understanding the RFC 9562 bit layout, or depend on another library. Foundation should provide a straightforward way to create version 7 UUIDs, and a general mechanism for version introspection that accommodates other UUID versions, even if we do not generate them in UUID itself.

Proposed solution

Add a UUID.Version struct representing the well-known UUID versions from RFC 9562, a version property on UUID for introspection, a static factory method for creating version 7 UUIDs, and convenience properties for the nil and max UUIDs.

// Create a time-ordered UUID
let id = UUID.timeOrdered()

// Inspect the version of any UUID
switch id.version {
case .timeOrdered:
    print("v7 UUID, sortable by creation time")
case .random:
    print("v4 UUID")
default:
    print("other version")
}

// The existing init() continues to create version 4 UUIDs
let randomID = UUID()
assert(randomID.version == .random)

// Nil and max UUIDs for sentinel values
let nilID = UUID.nil   // 00000000-0000-0000-0000-000000000000
let maxID = UUID.max   // FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF

// Access the raw bytes without copying
let uuid = UUID()
let span: Span<UInt8> = uuid.span      // 16-element typed span

Detailed design

Nil and Max UUIDs

@available(FoundationPreview 6.4, *)
extension UUID {
    /// The nil UUID, where all 128 bits are set to zero, as defined by
    /// RFC 9562 Section 5.9. Can be used to represent the absence of a
    /// UUID value.
    public static let `nil`: UUID

    /// The max UUID, where all 128 bits are set to one, as defined by
    /// RFC 9562 Section 5.10. Can be used as a sentinel value, for example
    /// to represent "the largest possible UUID" in a sorted range.
    public static let max: UUID
}

The nil UUID (00000000-0000-0000-0000-000000000000) and max UUID (FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF) are special forms defined by RFC 9562. They are useful as sentinel values — for example, representing "no UUID" or defining the bounds of a UUID range. Note that neither the nil UUID nor the max UUID has a meaningful version or variant field; the version property returns Version(rawValue: 0) and Version(rawValue: 15) respectively.

Lowercase string representation

@available(FoundationPreview 6.4, *)
extension UUID {
    /// Returns a lowercase string created from the UUID, such as
    /// "e621e1f8-c36c-495a-93fc-0c247a3e6e5f".
    public var uuidStringLower: String { get }
}

The existing uuidString property returns an uppercase representation. Many systems — including web APIs, databases, and URN formatting (RFC 4122 §3) — conventionally use lowercase UUIDs. uuidStringLower avoids the need to call uuidString.lowercased(), which allocates an intermediate String.
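As a rough illustration of what uuidStringLower could do internally, the sketch below formats the 16 bytes straight into lowercase hex using only API that exists today. The function name lowercaseUUIDStringSketch is hypothetical, and the actual implementation may be structured quite differently (this version still builds a temporary byte array, it just never creates the uppercase String).

```swift
import Foundation

/// Hypothetical helper (not part of the proposal): formats a UUID as
/// lowercase hex directly from its bytes, without uppercasing first.
func lowercaseUUIDStringSketch(_ uuid: UUID) -> String {
    let hexDigits = Array("0123456789abcdef".utf8)
    var utf8 = [UInt8]()
    utf8.reserveCapacity(36)
    withUnsafeBytes(of: uuid.uuid) { bytes in
        for (index, byte) in bytes.enumerated() {
            // Hyphens precede bytes 4, 6, 8 and 10 (the 8-4-4-4-12 grouping).
            if index == 4 || index == 6 || index == 8 || index == 10 {
                utf8.append(UInt8(ascii: "-"))
            }
            utf8.append(hexDigits[Int(byte >> 4)])
            utf8.append(hexDigits[Int(byte & 0x0F)])
        }
    }
    return String(decoding: utf8, as: UTF8.self)
}

let lowered = lowercaseUUIDStringSketch(UUID())
```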

span property

@available(FoundationPreview 6.4, *)
extension UUID {
    /// A `Span<UInt8>` view of the UUID's 16 bytes.
    public var span: Span<UInt8> { get }
}

This property provides zero-copy, bounds-checked access to the UUID's bytes without the need for withUnsafeBytes or tuple element access. The returned Span<UInt8> is lifetime-dependent on the UUID value.

Initializing from a Span

@available(FoundationPreview 6.4, *)
extension UUID {
    /// Creates a UUID by copying exactly 16 bytes from a `Span<UInt8>`.
    public init(copying span: Span<UInt8>)
}

This initializer copies the bytes from a Span<UInt8> into a new UUID. The span must contain exactly 16 bytes; otherwise, the initializer traps.
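For comparison, here is roughly how the same byte access and byte-wise construction look with today's API. Both helper names are hypothetical; the sketch only mirrors the documented behavior (16 bytes out, trap unless exactly 16 bytes in) and says nothing about how the proposed span and init(copying:) are implemented.

```swift
import Foundation

/// Hypothetical helper: copies the 16 bytes out of a UUID using today's API.
func bytesSketch(of uuid: UUID) -> [UInt8] {
    withUnsafeBytes(of: uuid.uuid) { Array($0) }
}

/// Hypothetical helper: builds a UUID from exactly 16 bytes and traps
/// otherwise, mirroring the trap described for init(copying:).
func makeUUIDSketch(copying bytes: [UInt8]) -> UUID {
    precondition(bytes.count == 16, "A UUID is exactly 16 bytes")
    var raw: uuid_t = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
    withUnsafeMutableBytes(of: &raw) { $0.copyBytes(from: bytes) }
    return UUID(uuid: raw)
}

let source = UUID()
let copy = makeUUIDSketch(copying: bytesSketch(of: source))
```

The unsafe pointer closures and the tuple literal are the boilerplate that span and init(copying:) are meant to replace.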

Initializing from an OutputSpan

@available(FoundationPreview 6.4, *)
extension UUID {
    /// Creates a UUID by filling its 16 bytes using a closure that
    /// writes into an `OutputSpan<UInt8>`.
    ///
    /// The closure must write exactly 16 bytes into the output span.
    public init<E: Error>(
        initializingWith initializer: (inout OutputSpan<UInt8>) throws(E) -> ()
    ) throws(E)
}

This initializer provides a safe, typed-throw-compatible way to construct a UUID from raw bytes without going through uuid_t:

let uuid = UUID { output in
    output.append(timestampBytes)
    output.append(randomBytes)
}

The closure receives an OutputSpan<UInt8> backed by the UUID's 16-byte storage. If the closure writes fewer or more than 16 bytes, the initializer traps. If the closure throws, the error is propagated with its original type.

UUID.Version

@available(FoundationPreview 6.4, *)
extension UUID {
    /// The version of a UUID, as defined by RFC 9562.
    public struct Version: Sendable, Hashable, Codable, RawRepresentable {
        public let rawValue: UInt8
        public init(rawValue: UInt8)

        /// Version 1: Gregorian time-based UUID with node identifier.
        public static var timeBased: Version { get }

        /// Version 3: Name-based UUID using MD5 hashing.
        public static var nameBasedMD5: Version { get }

        /// Version 4: Random UUID.
        public static var random: Version { get }

        /// Version 5: Name-based UUID using SHA-1 hashing.
        public static var nameBasedSHA1: Version { get }

        /// Version 6: Reordered Gregorian time-based UUID.
        public static var reorderedTimeBased: Version { get }

        /// Version 7: Unix Epoch time-based UUID with random bits.
        public static var timeOrdered: Version { get }

        /// Version 8: Custom UUID with user-defined layout.
        public static var custom: Version { get }
    }
}

The version value is encoded in bits 48–51 of the UUID (the high nibble of byte 6), per RFC 9562. Version is a RawRepresentable struct rather than an enum, allowing new versions to be added without breaking source or binary compatibility. The well-known versions from RFC 9562 are provided as static properties. Versions 2 (DCE Security), 0 (nil UUID), and 15 (max UUID) do not have static properties but can be represented using Version(rawValue:) if needed.

version property

@available(FoundationPreview 6.4, *)
extension UUID {
    /// The version of this UUID, derived from the version bits
    /// (bits 48–51) as defined by RFC 9562.
    public var version: UUID.Version {
        get
    }
}
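To make the bit position concrete, the following sketch reads the version nibble with today's API and wraps it in a stand-in type. VersionSketch and versionSketch(of:) are hypothetical names that only imitate the shape of the proposed UUID.Version; they are not the proposed implementation.

```swift
import Foundation

/// Stand-in for the proposed UUID.Version; the name is hypothetical.
struct VersionSketch: RawRepresentable, Hashable, Sendable {
    let rawValue: UInt8
    init(rawValue: UInt8) { self.rawValue = rawValue }

    static var random: VersionSketch { VersionSketch(rawValue: 4) }
    static var timeOrdered: VersionSketch { VersionSketch(rawValue: 7) }
}

/// Reads bits 48 to 51, which are the high nibble of byte 6.
func versionSketch(of uuid: UUID) -> VersionSketch {
    VersionSketch(rawValue: uuid.uuid.6 >> 4)
}

// Because the type is a struct, a switch over it needs a default case.
switch versionSketch(of: UUID()) {
case .random:
    print("v4")
case .timeOrdered:
    print("v7")
default:
    print("other version")
}
```

Any nibble value, including 0 for the nil UUID and 15 for the max UUID, comes back as a plain VersionSketch(rawValue:) with no optional or unknown case involved.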

Creating version 7 UUIDs

@available(FoundationPreview 6.4, *)
extension UUID {
    /// Creates a new UUID with RFC 9562 version 7 layout: a Unix
    /// timestamp in milliseconds in the most significant 48 bits,
    /// followed by random bits. The variant and version fields are
    /// set per the RFC.
    ///
    /// Version 7 UUIDs sort in approximate chronological order
    /// when compared using the standard `<` operator, making them
    /// well-suited as database primary keys. UUIDs created within
    /// the same millisecond are distinguished by random bits and
    /// may not reflect exact creation order.
    public static func timeOrdered() -> UUID

    /// Creates a new UUID with RFC 9562 version 7 layout using
    /// the specified random number generator for the random bits.
    ///
    /// - Parameter generator: The random number generator to use
    ///   when creating the random portions of the UUID.
    /// - Returns: A version 7 UUID.
    public static func timeOrdered(
        using generator: inout some RandomNumberGenerator
    ) -> UUID
}

The resulting UUID contains a millisecond-precision Unix timestamp in bits 0–47, with version and variant fields set per RFC 9562. The remaining bits are filled using the system random number generator (for timeOrdered()) or the provided generator (for timeOrdered(using:)). The timeOrdered() convenience delegates to timeOrdered(using:) with a SystemRandomNumberGenerator.
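The layout described above can be sketched by hand with today's API, which is also roughly what developers have to write at the moment. The function name makeTimeOrderedSketch and its unixMilliseconds parameter are assumptions made for this illustration; the draft implementation may assemble the value differently.

```swift
import Foundation

/// Hypothetical helper: assembles a version 7 UUID by hand.
/// `unixMilliseconds` is a parameter of this sketch, not of the proposed API.
func makeTimeOrderedSketch(
    unixMilliseconds: UInt64,
    using generator: inout some RandomNumberGenerator
) -> UUID {
    var bytes = [UInt8](repeating: 0, count: 16)
    // Bits 0 to 47: the millisecond timestamp, most significant byte first.
    for index in 0..<6 {
        bytes[index] = UInt8(truncatingIfNeeded: unixMilliseconds >> UInt64(8 * (5 - index)))
    }
    // Everything else starts out random.
    for index in 6..<16 {
        bytes[index] = UInt8.random(in: 0...255, using: &generator)
    }
    bytes[6] = 0x70 | (bytes[6] & 0x0F)   // version 7 in the high nibble of byte 6
    bytes[8] = 0x80 | (bytes[8] & 0x3F)   // RFC 9562 variant: top two bits are 10

    var raw: uuid_t = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
    withUnsafeMutableBytes(of: &raw) { $0.copyBytes(from: bytes) }
    return UUID(uuid: raw)
}

var systemRNG = SystemRandomNumberGenerator()
let nowMilliseconds = UInt64(Date().timeIntervalSince1970 * 1000)
let handMade = makeTimeOrderedSketch(unixMilliseconds: nowMilliseconds, using: &systemRNG)
```

Forgetting either mask, or writing the timestamp in the wrong byte order, still yields a UUID that looks plausible, which is why a dedicated factory method is attractive.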

Extracting the timestamp

@available(FoundationPreview 6.4, *)
extension UUID {
    /// For version 7 UUIDs, returns the `Date` encoded in the
    /// most significant 48 bits. Returns `nil` for all other versions.
    /// The returned date has millisecond precision, as specified
    /// by RFC 9562.
    public var timeOrderedTimestamp: Date? {
        get
    }
}
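A minimal sketch of the decoding, again using only today's API: check the version nibble, then fold the first six bytes into a millisecond count. The function name timeOrderedTimestampSketch is hypothetical, and the UUID string is just a sample version 7 value chosen for the example.

```swift
import Foundation

/// Hypothetical helper mirroring the proposed timeOrderedTimestamp.
func timeOrderedTimestampSketch(of uuid: UUID) -> Date? {
    let bytes = withUnsafeBytes(of: uuid.uuid) { Array($0) }
    // Only version 7 stores a Unix timestamp in this position.
    guard bytes[6] >> 4 == 7 else { return nil }
    // Fold the first six bytes (big-endian) into a millisecond count.
    let milliseconds = bytes[0..<6].reduce(UInt64(0)) { ($0 << 8) | UInt64($1) }
    return Date(timeIntervalSince1970: Double(milliseconds) / 1000)
}

let sampleV7 = UUID(uuidString: "017F22E2-79B0-7CC3-98C4-DC0C0C07398F")!
let decoded = timeOrderedTimestampSketch(of: sampleV7)
// The first 48 bits are 0x017F22E279B0, i.e. 1_645_557_742_000 ms after the Unix epoch.
```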

Source compatibility

This proposal is purely additive. The existing UUID() initializer continues to create version 4 random UUIDs. The random(using:) static method is unaffected. No existing behavior changes.

UUIDs created by timeOrdered() are fully valid UUIDs and interoperate with all existing APIs that accept UUID or NSUUID, including Codable, Comparable, bridging, and string serialization.

Implications on adoption

This feature can be freely adopted and un-adopted in source code with no deployment constraints and without affecting source compatibility.

Future directions

  • Version 5 (name-based SHA-1): A factory method like UUID.nameBased(name:namespace:) could be added in a future proposal for deterministic UUID generation.
  • Version 8 (custom): Could be exposed via an initializer that accepts the custom data bits while setting the version and variant fields automatically. For now, we do provide an initializer that allows for setting all of the bytes directly via OutputSpan.

Alternatives considered

Adding version as a parameter to init()

Instead of UUID.timeOrdered(), we considered UUID(version: .timeOrdered). However, different versions require different parameters — version 5 needs a name and namespace, version 8 needs custom data — so a single initializer would either need to accept many optional parameters or use an associated-value enum. Static factory methods are clearer and allow each version to have its own natural parameter list.

Using an enum for Version

We considered making Version an enum with a UInt8 raw value. However, a struct with RawRepresentable conformance allows new versions to be added in the future without breaking source or binary compatibility. Since the UUID version field is only 4 bits, the full space of 16 values is defined by the RFC, but using a struct is more consistent with Foundation's conventions for open sets of values (e.g., NSNotificationName, RunLoop.Mode) and avoids the need for an unknown case or optional return from the version property.

Supporting all UUID versions immediately

We considered adding factory methods for all versions (1, 3, 5, 6, 7, 8), but the immediate need is version 7. Version 1 (time-based with MAC address) has privacy implications. Versions 3 and 5 require different parameters. Version 6 is a reordering of version 1 and shares its concerns. Version 8 is intentionally application-defined. Starting with version 7 keeps the proposal focused while the Version struct provides the foundation to add others incrementally.

Nits and questions:

Do we need a non-standard shortening of "lowercase(d)" here? uuidLowercase(d)String—or uuidStringLowercased if alphabetically listing it next to uuidString is absolutely essential—both read more fluently as per API naming guidelines.


On the surface, it makes sense to have this API as the dual for the span property, but it's copying out of the span—which seems like an operation that should be feasibly composable from others and maybe not so essential?

This also leads me to wonder:

  1. whether it makes sense and is possible to have mutableSpan for in-place editing;
  2. whether, assuming we do want to keep this copying initializer, it ought to take raw bytes (RawSpan), since the element type here is really of no consequence and taking a RawSpan would allow data from strictly more spans to be passed along as the argument without memory rebinding or copying;
  3. whether it makes sense to have an init(_: [16 of UInt8]) initializer which, of course, would not have to trap at runtime—and maybe even a corresponding property.

As an appreciator of UUID, but a casual one, these seem non-intuitive to me.

I think anyone can understand and see that v1, v2, v3, etc. are different from each other. Without having to think twice, I know the last thing I vibe coded uses v4 UUIDs.

But timeBased versus reorderedTimeBased versus timeOrdered? It does not seem by my (admittedly shallow) reading that this exact verbiage has been fixed as term-of-art. By which I mean, if you approached a UUID enthusiast and asked: "Which versions of UUID are referred to by timeBased and timeOrdered, respectively?", it seems they'd have to think about it and/or consult the documentation. (There's nothing wrong in general with consulting documentation, of course; the point here being, though, that perhaps this consultation to map names to their meanings can be improved upon.)

Even the doc comment here describes both v1 and v7 as "time-based," and the initializer to create a new v7 UUID carries a doc comment which headlines that timeOrdered() means v7 (as does, notably, the title of this pitch).

Is there a reason not to call v7, v7?

[Additionally, if we're concerned about autocomplete lists when it comes to uuidStringLower(cased), I'd point out here that the naming proposed would have this autocomplete list show v1-v8 in a rather jumbled order.]

If you're adding nil and max, how would you feel about adding an init(_ intValue:), like the one in swift-dependencies?

extension UUID {
  /// Initializes a UUID from an integer by converting it to hex and padding it with 0's.
  ///
  /// For example:
  ///
  /// ```swift
  /// UUID(16) == UUID(uuidString: "00000000-0000-0000-0000-000000000010")
  /// ```
  ///
  /// If a negative number is passed to this function then it is inverted and the negative sign
  /// is encoded into the 16th bit of the UUID:
  ///
  /// ```swift
  /// UUID(-16) == UUID(uuidString: "00000000-0000-0001-0000-000000000010")
  ///                                            👆
  /// ```
  public init(_ intValue: Int) {
    let isNegative = intValue < 0
    let intValue = isNegative ? -intValue : intValue
    var hexString = String(format: "%016llx", intValue)
    hexString.insert("-", at: hexString.index(hexString.startIndex, offsetBy: 4))
    self.init(uuidString: "00000000-0000-000\(isNegative ? "1" : "0")-\(hexString)")!
  }
}

This allows for easy creation of deterministic values for testing.

Top line: love it. These are much needed updates. And v7 support is increasingly critical.


A few questions though:

The Python version of UUID.timeOrdered() - uuid7 - guarantees monotonicity within a millisecond (they use a 42-bit counter). Do you have any thoughts on whether this would be useful functionality or whether it is prohibitively costly, e.g. due to threading?
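To make the cost concrete, here is a rough sketch of what a same-millisecond ordering guarantee involves. It is not Python's algorithm and not part of the proposal: the type name MonotonicV7GeneratorSketch is hypothetical, and for brevity it uses only the 12 bits after the version nibble as a counter (RFC 9562 lists a dedicated counter as one option). The point is that the generator has to remember state between calls.

```swift
import Foundation

/// Hypothetical generator that keeps a counter so that values minted in the
/// same millisecond still sort in creation order.
struct MonotonicV7GeneratorSketch {
    private var lastMilliseconds: UInt64 = 0
    private var counter: UInt16 = 0

    init() {}

    mutating func next(unixMilliseconds now: UInt64) -> UUID {
        if now > lastMilliseconds {
            lastMilliseconds = now
            counter = 0
        } else if counter < 0x0FFF {
            counter += 1               // same (or earlier) millisecond: bump the counter
        } else {
            lastMilliseconds += 1      // counter exhausted: borrow the next millisecond
            counter = 0
        }
        var bytes = [UInt8](repeating: 0, count: 16)
        for index in 0..<6 {
            bytes[index] = UInt8(truncatingIfNeeded: lastMilliseconds >> UInt64(8 * (5 - index)))
        }
        bytes[6] = 0x70 | UInt8(counter >> 8)   // version 7 plus the top 4 counter bits
        bytes[7] = UInt8(counter & 0xFF)        // low 8 counter bits
        for index in 8..<16 {
            bytes[index] = UInt8.random(in: 0...255)
        }
        bytes[8] = 0x80 | (bytes[8] & 0x3F)     // RFC 9562 variant bits
        var raw: uuid_t = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
        withUnsafeMutableBytes(of: &raw) { $0.copyBytes(from: bytes) }
        return UUID(uuid: raw)
    }
}

var monotonic = MonotonicV7GeneratorSketch()
let firstInMillisecond = monotonic.next(unixMilliseconds: 1_700_000_000_000)
let secondInMillisecond = monotonic.next(unixMilliseconds: 1_700_000_000_000)
```

A process-wide UUID.timeOrdered() with this guarantee would need one shared instance of such state behind a lock or atomic, which is where the threading cost comes from.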

I second the suggestion to use .v1, .v2, etc. over these custom names for the Version struct - I think it’s clearer for the type of user who would want to know this info and can have a doc comment for the merely curious. For example, Python’s uuid module uses the name uuid4() to generate a random UUID.

I like UUID.timeOrdered() as the name for the generation function. I know we don’t review the doc comments exactly, but I’d like to say I really appreciate the clarity on the parameter-less version & would hope that would be copied over to the one that takes an RNG.

These are not RFC-valid UUIDs. I think having to force these through an explicit "from bytes" step is a good thing.

What does the version accessor produce if I make a UUID with a non-standard variant?

I appreciate the intent of the names spelling out "how they work", but I think for most people's usage, names that reference the standards would be more useful:

UUID.v7(...)
uuid.rfc4122String

(Potentially both sets of names could coexist, but if only one should exist, my preference is for the ones that reference the specifications)

I agree with @xwu that this reads a little awkwardly. How about lowercasedUUIDString? You could also add a new StringRepresentation type so that users could do uuid.string(.lowercase).

Why not a RawSpan instead?

Did you consider adding an optional/defaulted Date parameter to the timeOrdered factory methods to allow users to pick which date they want to encode?

Thank you for pitching this much-needed improvement to UUID!

Some things I really appreciate about this proposal:

  • It maintains backward compatibility.
  • It focuses on the most-needed improvement, UUID v7, while establishing a pattern for supporting other UUID versions.
  • Adding easy references to the upper and lower/nil bounds.
  • Adding a version property.
  • Having a factory method rather than overriding the initializer; I think your reasons for doing so are valid.

I agree with some of the feedback that has been presented so far, however:

  • Although slightly more verbose, I think that uuidStringLowercased would be more consistent with existing nomenclature in Swift than uuidStringLower.
  • I would also prefer using version numbers — v4, v7, etc. — over the more descriptive identifiers used here like timeOrdered. Version numbers seem to be more standard in other contexts.

Thanks for all the insightful questions. I'll try to address them together.

  1. uuidStringLower vs something else. I think I like lowercasedUUIDString. Although its capitalization of UUID doesn't match uuidString, it's probably the best alternative. I'll rename it.

  2. Version numbers as the struct static vars vs names. The rawValue of the struct is the version number (this is not really explained well in the proposal, admittedly), so we sort of have both options available. myUUID.version.rawValue == 7 would work, for example. That doesn't do much for the initializer, but v7() is a very strange function name and I didn't particularly like it. I don't think the UUID version numbers are really "ordered" either, so the autocomplete ordering of them, in my opinion, is not critical. I'm open to other ideas here.

  3. @willft Guaranteeing monotonicity - as you note, this would require global state, which usually has negative consequences for testing, predictability, etc. I'm unsure if this is worth the tradeoff, but would love to hear other community opinions on the value of this.

  4. @jrose - version simply reads the value of the version half-byte. You'll wind up with a Version with whatever integer value is there.

  5. mutableSpan - I think this is fine, but also see below. @Jon_Shier I think this would be sufficient for your testing purposes too, if the existing initializer wasn't.

  6. InlineArray initializer. This is inherently a copy, so I think I would prefer to just provide the copying Span initializer. You can easily get a Span<UInt8> from an InlineArray<16, UInt8> and, importantly, you can also get the span from whatever other storage you have lying around without copying it into an InlineArray first. You'll just have to slice it to make sure it's 16 bytes.

  7. Date argument for version 7 UUID creation. This seems plausible to add - I'm curious what the use case is for this? Testing purposes?

  8. RawSpan vs Span<UInt8>. This requires a bit more explanation.

I worked with @jmschonfeld on this before posting, and ultimately we came to the decision that there doesn't actually appear to be one correct answer here. I leaned towards Span<UInt8> because the implementation actually stored a tuple of UInt8 anyway, and most of its operations were most naturally expressed byte-wise. When I switched the implementation to an InlineArray<16, UInt8>, we again faced the reality that the storage is byte-wise and not raw.

We can't ignore, though, that things like version actually operate on half a byte, and the Date is 48 bits. The generation is probably most naturally expressed on a random 128 bit number instead of each byte, or even two 64 bit values. Since those are mostly implementation details, I felt like the most natural external representation was still bytes.

Tying this into the question about mutableSpan - that seems reasonable to add. However, if that were raw as well, then the span is sort of difficult to work with in a way in which I think most people would use it, which is byte-wise.

After thinking about this a bit, it makes sense to just allow providing a Date in the factory method that also allows providing a RandomNumberGenerator. I've updated everything to include it.

    /// Creates a new UUID with RFC 9562 version 7 layout using
    /// the specified random number generator for the random bits.
    ///
    /// - Parameter generator: The random number generator to use
    ///   when creating the random portions of the UUID.
    /// - Parameter date: The date to encode in the timestamp field.
    ///   If `nil`, the current date is used.
    /// - Returns: A version 7 UUID.
    public static func timeOrdered(
        using generator: inout some RandomNumberGenerator,
        at date: Date? = nil
    ) -> UUID

Or maybe inject a Clock, mirroring injecting the RNG provider?

I recommend using the most advanced and well-thought-out UUIDv7 implementation in PostgreSQL as a model. The timestamp offset parameter can be in milliseconds.

PostgreSQL Gains a Built-in UUIDv7 Generation Function for Primary Keys

commitdiff

Documentation

I can certainly see why the name nil was used here (as per the RFC); however, in Swift the difference between nil and .nil would be very subtle:

func foo(_ v: UUID?) { /*...*/ }
foo(nil)
foo(.nil)

Consider some other names like nilUUID (along with maxUUID for consistency), or .zero / .min instead of .nil.

A question: if I create a UUID of a particular version, convert it to a string, and then initialize a UUID from that string, will the version be preserved?
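This can be checked with today's API. The string form encodes all 128 bits, so a round trip through uuidString should give back an equal value, version nibble included. The sketch below uses a sample version 7 string plus a freshly generated version 4 UUID; the variable names are arbitrary.

```swift
import Foundation

// A sample version 7 value and a freshly generated version 4 value.
let originalV7 = UUID(uuidString: "017F22E2-79B0-7CC3-98C4-DC0C0C07398F")!
let originalV4 = UUID()

// Round-trip both through their string representation.
let restoredV7 = UUID(uuidString: originalV7.uuidString)
let restoredV4 = UUID(uuidString: originalV4.uuidString)

// The version lives in the high nibble of byte 6 and is carried by the string.
let restoredVersion = restoredV7.map { $0.uuid.6 >> 4 }
```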

Otherwise +1

By convention, static accessors for fixed instances of a type should generally not repeat the name of the type (or a stem of it) in the property name. For example, the Objective-C method +[NSFileManager defaultManager] becomes FileManager.default in Swift, and so forth. That rules out nilUUID/maxUUID.

.zero seems like a perfectly fine choice here, and preferable to .nil.

Came here to say that .zero, .min and .null are all better names than .nil

It's getting a little long though. Why not just lowercasedString? What other string would it produce than one representing the UUID? The existing uuidString could be aliased as uppercasedString for symmetry.

(Similarly, URL has absoluteString and relativeString. Those aren't about case conversions, but they follow the same pattern of adjective + noun, without repeating the type's name, even if they clearly produce what someone might store in a variable called urlString.)

If we are going to deviate from the spec, then I think I prefer min to nil or zero, since at least it mirrors the max value.

Nit: should it be zeros, or is it like a SIMD type?

But, in any case, I do think there's virtue in sticking to the specification's term even if it clashes a bit with other usage, so long as in actual practice at the call site usage would not be actively harmful. On that note...

I want to push back on this point. Yes, it's strange: I don't think anyone is going to argue that "UUID v7" is the paragon of good naming. But, as I recall, we have precedent here in terms of version numbers prefixed by v in SwiftPM APIs.

More crucially, though, as others have pointed out, some variation of "UUID v7" is simply what this feature is known as in specifications and implementing ecosystems that exist outside our ambit. The straightforward spelling one would expect for a UUID v7 initializer would be something like UUID(v7:) or, if a static function, UUID.v7().

Sure, v7 is kind of odd-looking. But so is, for example, iOS. But that's just what they're named. We wouldn't dream of spelling iOS specifically in Swift as OperatingSystemForAppleBrandedMobileDevice.

We've contended with this issue elsewhere: Consider, for example, that Unicode.Scalar uses 'scalar' in a distinct way from SIMDScalar; or that the IEEE floating-point maximum operation is distinct from Swift.max. If we were creating these specifications, obviously we would not choose clashing terminology like this. But when explicitly adopting an external specification, our practice has been (rightly, IMO) to give specified entities the names they've been specified with unless there is some clear, active harm.

For testing purposes, sure. I don’t have a concrete use case in mind, but it seemed surprising to me that we allow you to read a Date value out of the UUID but not choose which date is stored. I’d definitely like to hear from someone who has used UUIDv7 in practice about whether they ever care about choosing a custom date.