[Pitch] `@OptionSet` macro

davedelong · March 6, 2023, 11:27pm

This will likely be an unpopular opinion and I'll probably take some flak for it, but for the record:

-1 We should not do this, because OptionSet is not a particularly good API and I don't believe we should be actively promoting its use by giving it privileged functionality.

A bit more rationale

Semantically, OptionSet is identical to a Set<SomeOption>. As a protocol it adds nothing to the SetAlgebra protocol except for the single distinction that it is also RawRepresentable. Or in other words, OptionSet exists specifically to expose the implementation detail that a bunch of values can be represented as a bitmask.

I would much rather see a new kind of protocol (BitMaskRepresentable) that can be adopted by a type and then specializations on Set where the Element: BitMaskRepresentable.

Then we would always be dealing with Set<Foo> values; we would be able to make meaningful distinctions between "a single value" versus "a set which contains a single value" (which is very unintuitive in OptionSet land); etc.
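
To make the suggestion above concrete, here is a rough sketch of what it could look like. `BitMaskRepresentable`, its `bitMask` requirement, and the `Set` extension members are all hypothetical names chosen for illustration; nothing like this exists in the standard library, and the sketch ignores the performance concerns of backing the value with a real `Set`.

```swift
// Hypothetical protocol: each value occupies exactly one bit of a mask.
protocol BitMaskRepresentable: Hashable, CaseIterable {
    var bitMask: UInt64 { get }
}

enum ShippingOption: Int, BitMaskRepresentable {
    case nextDay, secondDay, priority, standard

    // Assumption for this sketch: the raw value is used as the bit index.
    var bitMask: UInt64 { 1 << UInt64(rawValue) }
}

extension Set where Element: BitMaskRepresentable {
    /// Combined mask of every element in the set.
    var bitMask: UInt64 {
        reduce(0) { $0 | $1.bitMask }
    }

    /// Rebuilds a set from a mask, ignoring bits that match no case.
    init(bitMask: UInt64) {
        self.init(Element.allCases.filter { bitMask & $0.bitMask != 0 })
    }
}

let single: ShippingOption = .priority               // a single value
let options: Set<ShippingOption> = [.nextDay, .priority] // a set of values
print(options.bitMask)                               // 5
print(Set<ShippingOption>(bitMask: 5) == options)    // true
print(options.contains(single))                      // true
```

Here "a single value" (`ShippingOption`) and "a set which contains a single value" (`Set<ShippingOption>`) are different types, which is the distinction the post is asking for.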

bbrk24 · March 7, 2023, 12:00am

I haven't been following the macro proposals too closely; would it be possible for it to emit something like this?

struct MySet: OptionSet {
    var rawValue: Int

#if arch(arm64) || arch(x86_64)
    static let big = Self(rawValue: 1 << 33)
#elseif arch(arm) || arch(i386)
#error("MySet.big is too big for 32-bit platforms")
#else
#error("Unrecognized architecture")
#endif
}

Joe_Groff · March 7, 2023, 12:01am

I can agree that there's a lot of C brain in OptionSet, and it's not necessarily the best clean-slate approach to defining small sets of flags, but I take this proposal to be an attempt at eliminating privileged functionality, using macros that in theory anyone could write to replace the special functionality OptionSet gets today.

Ben_Cohen · March 7, 2023, 12:29am

Why would you rather this? Why would this be good? (the performance implications of it would be significant, so the upside would also need to be)

tera · March 7, 2023, 1:25am

A slight variation of the alternative: suppose the individual options get auto-incrementing values, similar to how enumeration constants get them, except that instead of 0, 1, 2... the sequence of values is 1, 2, 4, ...

optionset ShippingOptions: Int {
    case nextDay          // gets implicit 0x01 value
    case secondDay        // gets implicit 0x02 value
    case priority         // gets implicit 0x04 value
    case standard = 0x100 // explicit override
    case next             // gets implicit 0x200 value
    case crazy = 0x345    // rarely needed but we still can do this
    case after            // gets implicit 0x400 value
}

Edit:

In other words, when you customise a value you customise the actual "mask" value rather than the bit shift:

not:  case xxx = 5 // to get 1 << 5 mask value
but:  case xxx = 1 << 5 // right away

Obviously you'd then be able to write arbitrary expressions like the above "case crazy = 0x345" (rarely needed, but it may still be important).

ricketson · March 7, 2023, 4:33am

I like this!

If the code I would normally write is this:

struct ShippingOptions: OptionSet {
  let rawValue: Int
  
  static let nextDay = Self(rawValue: 1 << 0)
  static let secondDay = Self(rawValue: 1 << 1)
  static let priority = Self(rawValue: 1 << 2)
  static let standard = Self(rawValue: 1 << 3)
}

Then the macro directly reduces the boilerplate, letting me write a strict subset of the code I would normally have to write:

@OptionSet struct ShippingOptions {
  static let nextDay: Self
  static let secondDay: Self
  static let priority: Self
  static let standard: Self
}

This seems like a good general guideline for macros that share the same name as a protocol, like @OptionSet: the macro should help reduce the boilerplate of a typical, "happy path" conformance.

The current proposed solution does strictly reduce the boilerplate of the "private enum" style of implementing an OptionSet, but I'm not sure how common that particular style is. It seems better to me to spend the unique @OptionSet name on a macro that implements the conformance as directly as possible.

As some related prior art, SwiftUI has a few cases of concepts that come in both enum and option set forms, e.g. Edge and Edge.Set. In those cases, the enum is the outer type, with the option set declared as the inner .Set type.

The "enum wrapping an option set" pattern would be another alternative/future direction to consider for the proposal, in cases where a developer wants to create and expose both an enum and option set representation of the same concept, e.g.

@OptionSetRepresentable // <- straw man name
enum ShippingOption {
  case nextDay
  case secondDay
  case priority
  case standard
}

which would expand to a nested ShippingOption.Set option set type.

I like that pattern because it's generally more intuitive to declare and document an enum, so that's what I'd likely declare first and want to have as the top-level type in my API. It also mirrors the OptionSet name by using an <Option>.Set naming pattern.

(To be clear, this approach is complementary with the @OptionSet macro, not a mutually-exclusive alternative — if you only want an option set, and want to keep the enum representation private, then use @OptionSet.)
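
As a hedged illustration of the "enum wrapping an option set" pattern, the following is a hand-written stand-in for what such a macro might generate. The straw-man `@OptionSetRepresentable` macro does not exist, so the nested `Set` type and its `init(_:)` are written out manually here, and using the enum's raw value as the bit index is an assumption of this sketch.

```swift
enum ShippingOption: Int, CaseIterable {
    case nextDay, secondDay, priority, standard

    // Hand-written stand-in for a hypothetical macro expansion.
    struct Set: OptionSet {
        let rawValue: Int
        init(rawValue: Int) { self.rawValue = rawValue }

        /// Wraps a single enum case as a one-element option set.
        init(_ option: ShippingOption) {
            self.init(rawValue: 1 << option.rawValue)
        }

        static let nextDay = Self(ShippingOption.nextDay)
        static let secondDay = Self(ShippingOption.secondDay)
        static let priority = Self(ShippingOption.priority)
        static let standard = Self(ShippingOption.standard)
    }
}

let chosen: ShippingOption.Set = [.nextDay, .priority]
print(chosen.contains(.priority)) // true
print(chosen.rawValue)            // 5
```

This mirrors the `Edge` / `Edge.Set` shape: the enum is the documented top-level type and the option set hangs off it.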

taylorswift · March 7, 2023, 5:57am

i am not sure if anyone has brought this up before, but i wonder if it would be possible to have some sort of “diagnostic macro”, that emits no code, but receives a view of a syntax tree at build-time and can emit compiler warnings if the expression is obviously wrong. i can see this being very valuable in the DSL domain, if we could do something like:

@Validate3ElementArrayLiteral
extension Vector3:ExpressibleByArrayLiteral
{
    ...
}

and users of the Vector3 library running a new enough compiler could get compile-time diagnostics for their Vector3 array literals.

anon9791410 · March 7, 2023, 6:07am

I don't understand your "explicit overrides", but

already works. But because of the need for manual forwarding, I don't think it's as good as my solution above.

To get rid of the manual forwarding, I don't think a special-case macro is a good idea. I understand people are excited to play with macros, but to improve OptionSet, they probably should be working on more generalized forwarding via static subscripts with dynamicMember instead.

struct ShippingOptions: OptionSet {
  private enum Option: BitFlag<RawValue> {
    case nextDay, secondDay
    // this bit is cursed, don't use it
    case priority = 3, standard
  }

  static let nextDay = Self(Option.nextDay)
  static let secondDay = Self(Option.secondDay)
  static let priority = Self(Option.priority)
  static let standard = Self(Option.standard)

  let rawValue: Int
}

public extension OptionSet {
  init(_ option: some RawRepresentable<some RawRepresentable<RawValue>>) {
    self.init(rawValue: option.rawValue.rawValue)
  }
}

BitFlag

/// A representation of a single bit "flag".
public struct BitFlag<RawValue: BinaryInteger & _ExpressibleByBuiltinIntegerLiteral> {
  public let rawValue: RawValue

  public init?(rawValue: RawValue) {
    guard
      rawValue != 0,
      rawValue & (rawValue - 1) == 0
    else { return nil }

    self.rawValue = rawValue
  }
}

// MARK: - Equatable
extension BitFlag: Equatable { }

// MARK: - ExpressibleByIntegerLiteral
extension BitFlag: ExpressibleByIntegerLiteral {
  public init(integerLiteral flagIndex: RawValue) {
    self.init(rawValue: 1 << flagIndex)!
  }
}

// MARK: - RawRepresentable
extension BitFlag: RawRepresentable { }

tera · March 7, 2023, 7:30am

Likewise, I suppose there's no requirement that the new "better option sets" be implemented using macros. And if slight syntax improvements are possible by implementing them differently, those are well worth considering.

Option sets are somewhat "in a league of their own". There are some similarities and differences compared to both enums and structs:

Like enums, they have only a single var (rawValue), whereas structs can have other variables.
Like both enums and structs, they can have static variables and instance and static methods.
Like enums, they have a number of "cases" (these could be emulated with a struct's "static lets", with some limitations).
Like enums, they could perhaps support AllCases (e.g. get the "all" field automatically), and perhaps even enumerate the cases.
They could be emulated as a struct with a bunch of "var field: Bit" properties (not currently possible in Swift, but at least conceptually, in which case there's no limit of 64 bits).
Interestingly, enums imported from Obj-C (NS_ENUM) can hold arbitrary integer patterns, which makes them very similar to what we are looking for in option sets.

Just that in the below example the case called "standard" would get the value of 0x08, and it doesn't because we are explicitly overriding it with a different value:

optionset ShippingOptions: Int {
    case nextDay          // gets implicit 0x01 value
    case secondDay        // gets implicit 0x02 value
    case priority         // gets implicit 0x04 value
    case standard = 0x100 // explicit override
    case next             // gets implicit 0x200 value
    case crazy = 0x345    // rarely needed but we still can do this
    case after            // gets implicit 0x400 value
}

As with enums, the value after that (the case called "next" in this example) resumes implicit assignment with the next mask value, 0x200.
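
The numbering rule in the example can be sketched as a small function. This is only one reading of the example, not a specification: it assumes an implicit case gets the smallest power of two strictly greater than the previous case's value, and `assignMasks` is a hypothetical helper name.

```swift
/// Assigns a mask to each case. Explicit values are taken as-is; implicit
/// cases get the smallest power of two greater than the previous value.
func assignMasks(_ cases: [(name: String, explicit: Int?)]) -> [Int] {
    var previous = 0
    var masks: [Int] = []
    for entry in cases {
        var mask = 1
        if let explicit = entry.explicit {
            mask = explicit
        } else {
            while mask <= previous { mask <<= 1 }
        }
        masks.append(mask)
        previous = mask
    }
    return masks
}

let masks = assignMasks([
    (name: "nextDay", explicit: nil),
    (name: "secondDay", explicit: nil),
    (name: "priority", explicit: nil),
    (name: "standard", explicit: 0x100),
    (name: "next", explicit: nil),
    (name: "crazy", explicit: 0x345),
    (name: "after", explicit: nil),
])
print(masks.map { String($0, radix: 16) })
// ["1", "2", "4", "100", "200", "345", "400"]
```

Under that assumed rule the output matches the comments in the `optionset ShippingOptions` example.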

Douglas_Gregor · March 7, 2023, 7:38am

None of these make option sets special; it's just reducing a small amount of boilerplate. You mention NS_ENUM, but NS_OPTIONS is the relevant macro: it's imported as an option set, and is (annoyingly) a less-verbose way to create an option set than any way we could do it in Swift. The macros being discussed here can address that issue.

tera:

Just that in the below example the case called "standard" would get the value of 0x08, and it doesn't because we are explicitly overriding it with a different value:

optionset ShippingOptions: Int {
    case nextDay          // gets implicit 0x01 value
    case secondDay        // gets implicit 0x02 value
    case priority         // gets implicit 0x04 value
    case standard = 0x100 // explicit override
    case next             // gets implicit 0x200 value
    case crazy = 0x345    // rarely needed but we still can do this
    case after            // gets implicit 0x400 value
}

You can easily implement this behavior with the macro.

Doug

anon9791410 · March 7, 2023, 7:46am

That seems antithetical to this thread, to be in both 0b/0x land and bit-flag-index land. 0x100 should be 8. (That works with my previous post.) 0x345 (0b11_0100_0101) is not an Option; it's five.
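
For anyone who wants to check the arithmetic, a quick sketch using the standard library's bit-counting properties; `isSingleFlag` is a hypothetical helper that applies the same power-of-two test as `BitFlag.init?(rawValue:)`.

```swift
let standard = 0x100
let crazy = 0x345

/// True when exactly one bit is set, i.e. the mask is a single option.
func isSingleFlag(_ mask: Int) -> Bool {
    mask != 0 && mask & (mask - 1) == 0
}

// 0x100 is a single flag, at bit index 8.
print(isSingleFlag(standard), standard.trailingZeroBitCount) // true 8

// 0x345 has five bits set, so it is a combination of five options.
print(isSingleFlag(crazy), crazy.nonzeroBitCount)            // false 5
```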

markuswntr · March 7, 2023, 7:56am

+1: this becoming a macro and on improving the ergonomics of OptionSet.

I personally find this suggestion easier to read and to understand than the pitched private enum Options: Int {} approach.

The latter one also does remind me of CodingKeys, where at first it was not immediately obvious (to me) whether or not the compiler would pick them up if they were defined private or in an extension.
You have to try it to see the result.

tera · March 7, 2023, 8:02am

I believe it could make a difference. For example you may have an AnyBitSet type, size of Int64, that you'll be able to cast back and forth easily to arbitrary "optionset" values with no boxing unboxing overhead (as this is just an int under the hood). But not so if you have a less restricted "struct" value:

@OptionSet
struct ShippingOptions {
    private enum Options: Int {
        case nextDay, secondDay, priority, standard
    }
    var someExtraField: Int
}

which may have some arbitrary extra fields in it.

Unless of course you are telling me that the "var someExtraField: Int" above would be a syntax error.

Douglas_Gregor · March 7, 2023, 8:09am

tera:

But not so if you have a less restricted "struct" value:
@OptionSet
struct ShippingOptions {
    private enum Options: Int {
        case nextDay, secondDay, priority, standard
    }
    var someExtraField: Int
}
which may have some arbitrary extra fields in it.

The OptionSet macro could detect the presence of this stored property and produce an error. It's a couple of lines of code in the macro implementation. Even if that were not true, the possibility that someone could write such a thing and get confused would not motivate the addition of a new feature.

I don't know how to be anything but blunt here: there is no path where Swift gets a new kind of nominal type for option sets. They do not, and will not ever, meet the criteria for addition into the language. If you want to continue this discussion, please do so in a separate thread, where I will not be participating.

Please let this thread focus on how best to use macros to eliminate boilerplate for such a case. If the end result is not good enough, not compelling enough, then we should reject it then. But it won't be in favor of a language feature.

Doug

swhitty · March 7, 2023, 10:19pm

+1 on using macros to synthesise options. As many have pointed out OptionSet brings pitfalls and synthesis removes much of the subtlety.

I am drawn to using static var for the options over a nested enum Options: Int because as mentioned the macro expansion diff is smaller which is a usability win and should not be understated.

It may take some time for me to get used to, but I much prefer when the code I am reading explicitly states the protocol conformance — generic constraints and protocol conformance already have a steep learning curve in Swift and using macros like @OptionSet makes this even steeper.

struct ShippingOptions: OptionSet { } // explicit and clear

@OptionSet
struct ShippingOptions { } // yes tooling can expand but this is an additional abstraction

I would be hesitant to support @OptionSet so quickly in this form to the standard library without some discussion on these forums about the feasibility of witness macros mentioned in the possible vision.

michelf · March 7, 2023, 11:00pm

For a project of mine, instead of using OptionSet I built a BitwiseSet type that can be used in this way:

enum Side: Int {
    case left, right, bottom, top
}
typealias Sides = BitwiseSet<Side>

// in usage:
let sides: Sides = [.left, .right]

I've been pretty happy with this.

In my opinion, this OptionSet macro feels more heavyweight than necessary (both in conceptual complexity and declaration syntax), and has an uglier declaration syntax. The equivalent option-set to the above would look like this:

@OptionSet
struct Sides {
    enum Options: Int {
        case left, right, bottom, top
    }
}

// in usage:
let sides: Sides = [.left, .right]

The only advantage of OptionSet I can see is that the type of the set is the same type as its value, allowing you to omit the [] when there's a single value:

let sides: Sides = .left // no [] only works with OptionSet

Although honestly I'm not sure if this is an advantage or a fault. (I wonder now if this could be imitated for BitwiseSet with some key path forwarding trickery.)
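
A minimal sketch of that difference, using a hand-written `OptionSet` next to a plain `Set` of enum cases. The `Sides` and `Side` types here are standalone stand-ins for this example, not the `BitwiseSet`-based ones above.

```swift
struct Sides: OptionSet {
    let rawValue: Int
    static let left = Sides(rawValue: 1 << 0)
    static let right = Sides(rawValue: 1 << 1)
    static let bottom = Sides(rawValue: 1 << 2)
    static let top = Sides(rawValue: 1 << 3)
}

enum Side: Hashable { case left, right, bottom, top }

// With OptionSet, an element and a one-element set are the same type.
let bare: Sides = .left
let wrapped: Sides = [.left]
print(bare == wrapped) // true

// With Set<Side>, the element type and the set type differ, so the
// brackets are required; `let s: Set<Side> = .left` does not compile.
let asSet: Set<Side> = [.left]
print(asSet.contains(.left)) // true
```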

On the negative, with @OptionSet the enum has to be nested in the @OptionSet type and has a rigid name. Most of the time I'd rather use an external enum. I suppose an external enum could be "typealiased" inside of the @OptionSet type (will the macro allow that?), but that's just more boilerplate.

BitwiseSet implementation


/// BitwiseSet is a set of values stored as a bit mask. Elements in the set
/// must be RawRepresentable with an Int as a RawValue. Typically, the element 
/// type is an enum based on Int:
///
///     enum State: Int {
///         case closed
///         case open
///         case mixed
///     }
///     var validStates: BitwiseSet<State> = [.open, .closed]
///
/// Since the storage for BitwiseSet is an Int, raw values of its element
/// must not exceed the number of bits in an Int. So on a 64-bit environment,
/// the valid range of raw values for its elements are `0 ..< 64`, and with 
/// 32-bit it is `0 ..< 32`.
public struct BitwiseSet<Element>: SetAlgebra, Sequence, RawRepresentable, Hashable where Element: RawRepresentable, Element: Equatable, Element.RawValue == Int {
	public var rawValue: Int
	public init() {
		rawValue = 0
	}
	public init(rawValue: Int) {
		self.rawValue = rawValue
	}

	/// Initialize the set with one optional element, resulting either in a
	/// single-member set or the empty set (if `nil`).
	public init(_ member: Element?) {
		self.rawValue = BitwiseSet.rawValue(for: member)
	}

	public var first: Element? {
		return first(where: { _ in true })
	}

	public func intersects(_ other: BitwiseSet) -> Bool {
		return rawValue & other.rawValue != 0
	}
	public func union(_ other: BitwiseSet) -> BitwiseSet {
		return BitwiseSet(rawValue: rawValue | other.rawValue)
	}
	public func intersection(_ other: BitwiseSet) -> BitwiseSet {
		return BitwiseSet(rawValue: rawValue & other.rawValue)
	}
	public func symmetricDifference(_ other: BitwiseSet) -> BitwiseSet {
		return BitwiseSet(rawValue: rawValue ^ other.rawValue)
	}
	public mutating func formUnion(_ other: BitwiseSet) {
		rawValue |= other.rawValue
	}
	public mutating func formIntersection(_ other: BitwiseSet) {
		rawValue &= other.rawValue
	}
	public mutating func formSymmetricDifference(_ other: BitwiseSet) {
		rawValue ^= other.rawValue
	}

	public func contains(_ member: Element) -> Bool {
		return rawValue & BitwiseSet.rawValue(for: member) != 0
	}
	@discardableResult
	public mutating func insert(_ newMember: Element) -> (inserted: Bool, memberAfterInsert: Element) {
		let oldRawValue = rawValue
		rawValue |= BitwiseSet.rawValue(for: newMember)
		// `inserted` is true only when the bit was not already set.
		return (oldRawValue != rawValue, newMember)
	}
	@discardableResult
	public mutating func remove(_ member: Element) -> Element? {
		let oldRawValue = rawValue
		rawValue &= ~BitwiseSet.rawValue(for: member)
		return oldRawValue == rawValue ? nil : member
	}
	@discardableResult
	public mutating func update(with newMember: Element) -> Element? {
		let oldRawValue = rawValue
		rawValue |= BitwiseSet.rawValue(for: newMember)
		return oldRawValue == rawValue ? newMember : nil
	}

	/// The range of valid raw values for elements. This is determined by the 
	/// bit size of Int. On a 64-bit architecture, the range is `0 ..< 64`.
	public static var supportedRangeOfMemberRawValues: CountableRange<Int> {
		return 0 ..< MemoryLayout<RawValue>.size * 8
	}
	/// The raw value to use for storing the given element. This is
	/// simply `1 << member.rawValue`. `member.rawValue` must be inside of
	/// `supportedRangeOfMemberRawValues`.
	public static func rawValue(for member: Element?) -> Int {
		guard let member = member else { return 0 }
		assert(supportedRangeOfMemberRawValues.contains(member.rawValue), "BitwiseSet is limited to elements having a raw value in the range \(supportedRangeOfMemberRawValues) (dependent on the current architecture). Value \(member.rawValue) for \(member) is out of range.")
		return 1 << member.rawValue
	}

	/// The set that contains all possible elements values.
	/// - Note: Implemented by attempting to create an element with
	///         `init?(rawValue:)` from all supported raw values in
	///         `supportedRangeOfMemberRawValues`, adding non-nil elements into
	///         the set. This is not efficient if `init?(rawValue:)` is not.
	///         Also, there is no cache.
	public static var all: BitwiseSet {
		var all = BitwiseSet()
		for elementRawValue in BitwiseSet.supportedRangeOfMemberRawValues {
			if let element = Element(rawValue: elementRawValue) {
				all.insert(element)
			}
		}
		return all
	}

	// MARK: Sequence

	public struct Iterator: IteratorProtocol {
		fileprivate var index = 0
		fileprivate var remainingSet: BitwiseSet
		fileprivate init(_ set: BitwiseSet) { remainingSet = set }
		mutating public func next() -> Element? {
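			// Stop at the last supported bit so stray bits with no matching element cannot loop forever.
			while !remainingSet.isEmpty, BitwiseSet.supportedRangeOfMemberRawValues.contains(index) {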
				defer { index += 1 }
				if let element = Element(rawValue: index), remainingSet.contains(element) {
					remainingSet.remove(element)
					return element
				}
			}
			return nil
		}
	}

	public func makeIterator() -> Iterator {
		return Iterator(self)
	}

}

extension BitwiseSet: CustomStringConvertible {

	public var description: String {
		return "[\(map { "\($0)" }.joined(separator: ", "))]"
	}

}

tera · March 7, 2023, 11:43pm

michelf:

For a project of mine, instead of using OptionSet I built a BitwiseSet type that can be used in this way:
enum Side: Int {
    case left, right, bottom, top
}
typealias Sides = BitwiseSet<Side>

// in usage:
let sides: Sides = [.left, .right]

I like this very much. Combining this with the idea of @ricketson above we may have:

public protocol HasOptionSet: Equatable, CaseIterable, RawRepresentable where RawValue == Int {
    typealias OptionSet = BitwiseSet<Self>
}

that'll get to this final usage:

enum Side: Int, HasOptionSet {
    case left, right, bottom, top
}

let sides: Side.OptionSet = [.left, .right]

but frankly that's not much better than the explicit:

let sides: BitwiseSet<Side> = [.left, .right]

michelf:

The only advantage of OptionSet I can see is that the type of the set is the same type as its value, allowing you to omit the [] when there's a single value:
let sides: Sides = .left // no [] only works with OptionSet
Although honestly I'm not sure if this is an advantage or a fault.

The fact that I can do:

let sides: SideOptionSet = .left

but not:

let sides: Set<Side> = .left // 🛑 Type 'Set<Side>' has no member 'left'

does feel like a bug. IMHO the two should behave the same way (whatever it is).

Tony_Parker · March 8, 2023, 12:10am

Is there an affordance for adding availability to new options as the option set evolves?

Douglas_Gregor · March 8, 2023, 6:09am

Joe_Groff:

If attached macros are able to recognize the use of other macros within the body, would it be possible to do something like:
@OptionSet struct ShippingOptions {
  @Option static var nextDay, secondDay, priority, standard
}
where the @Option annotation takes care of doling out raw values for the options, and choosing the appropriate underlying bits type?

I went ahead and implemented something similar to your idea. The result is this:

@OptionSet<UInt8>
struct ShippingOptions {
  static var nextDay: ShippingOptions
  static var secondDay: ShippingOptions
  static var priority: ShippingOptions
  static var standard: ShippingOptions

  static let express: ShippingOptions = [.nextDay, .secondDay]
  static let all: ShippingOptions = [.express, .priority, .standard]
}

The implementation wasn't hard, and turns each of those non-initialized static properties into computed properties, e.g.,

static var nextDay: ShippingOptions {
  get {
    Self(rawValue: 1 << 0)
  }
}

One of the nice things about @Joe_Groff 's formulation here is that you can go ahead and put availability on the static variables, along with comments, access control, and anything else. This is one of the advantages of this "fill in the details" approach vs. what was originally proposed.
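
To illustrate the availability point, here is a hand-written approximation of what the expanded type might look like with one option gated by `@available`. This is a sketch, not the macro's actual output: the `drone` option and its platform versions are hypothetical, and assigning bits in declaration order is an assumption based on the `1 << 0` expansion shown above.

```swift
struct ShippingOptions: OptionSet {
    let rawValue: UInt8

    static var nextDay: ShippingOptions { Self(rawValue: 1 << 0) }
    static var secondDay: ShippingOptions { Self(rawValue: 1 << 1) }
    static var priority: ShippingOptions { Self(rawValue: 1 << 2) }
    static var standard: ShippingOptions { Self(rawValue: 1 << 3) }

    /// Hypothetical option added in a later release.
    @available(macOS 14, iOS 17, *)
    static var drone: ShippingOptions { Self(rawValue: 1 << 4) }

    static let express: ShippingOptions = [.nextDay, .secondDay]
    static let all: ShippingOptions = [.express, .priority, .standard]
}

print(ShippingOptions.all.rawValue) // 15

// Callers must check availability before touching the newer option.
if #available(macOS 14, iOS 17, *) {
    print(ShippingOptions.all.union(.drone).rawValue)
}
```

Because each option is an ordinary static property, attributes, doc comments, and access control attach to it in the usual way.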

Doug

pyrtsa · March 8, 2023, 6:33am

I think it would be really neat if the conformance-synthesising macros could be used at the call site as if they were just @-prefixed protocol names:

struct ShippingOptions: @OptionSet<Int> {
  static let nextDay: Self
  static let secondDay: Self
}