Overloading Generic Types

(I feel like I’ve already written this... I looked through my sent mail and didn’t see anything, but my sincerest apologies if I started this thread a month ago and forgot about it or something.)

I no longer recall exactly what first made me want to do this (probably something in my on-going “teach the compiler calculus” project), but I’ve been thinking lately that it could be quite handy to overload types themselves based on the value of generic parameters. As a somewhat contrived example:
struct Array <T> { /*everything exactly as it is now*/ }
struct Array <T> where T == Bool { /* packs every 8 bools into a UInt8 for more efficient storage */ }

We can already do this with functions… Conceptually this isn’t any different. As long as the specific version exposes everything the generic version does (easy for the compiler to enforce), I think everything would just work (famous last words). In this example, the subscript function would need to extract the specified bit and return it as a Bool instead of simply returning the specified element. The `Element` typealias would be `Bool` instead of `UInt8`, which would mean the size/stride might be different than expected (but that’s why we have `MemoryLayout<>`).
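For reference, the function-level version of this already works today; a minimal sketch (the `describe` functions are illustrative, not from the proposal — note it uses a protocol constraint rather than `T == Bool`, since a concrete same-type constraint on a function wasn't accepted at the time):

```swift
// Swift already picks the more-constrained overload for functions when
// the concrete type is known at the call site.
func describe<T>(_ x: T) -> String { return "any T" }
func describe<T: CustomStringConvertible>(_ x: T) -> String { return "\(x)" }

print(describe(42))      // "42" — Int is CustomStringConvertible
print(describe((1, 2)))  // "any T" — tuples don't conform
```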

Anyway, because generic structs & functions already can’t make assumptions about a generic argument (beyond any constraints, of course), I think this should be in phase 2… but I’m quite hazy on how generics work once the code’s been compiled, so maybe not.

- Dave Sweeris

I don't currently have a use for it, but I can certainly see how this might
be useful to some people.

As a side note, though, the consensus now seems to be that the
optimization shown in your specific example, which is provided by
std::vector<bool> in C++, was a poor design choice. See, for instance,
"Why isn't vector<bool> an STL container?" on Stack Overflow,
<https://isocpp.org/blog/2012/11/on-vectorbool>, and
<http://www.gotw.ca/publications/N1211.pdf>.

It would be interesting to see if you can come up with a use case where the
proposed feature is a clear win as opposed to just having a totally
separate type that makes clear it's not quite the same thing under the hood
(in your example, something like `BitArray`).
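For concreteness, the separate-type alternative might look like the following minimal sketch (`BitArray` here is hypothetical, not an existing standard-library type):

```swift
// A separate type that packs 8 Bools per UInt8 and says so in its name,
// rather than silently changing Array<Bool>'s representation.
struct BitArray {
    private var storage: [UInt8] = []
    private(set) var count = 0

    mutating func append(_ bit: Bool) {
        if count % 8 == 0 { storage.append(0) }       // grow by one byte per 8 bits
        if bit { storage[count / 8] |= 1 << UInt8(count % 8) }
        count += 1
    }
    subscript(index: Int) -> Bool {
        storage[index / 8] & (1 << UInt8(index % 8)) != 0
    }
}

var bits = BitArray()
bits.append(true); bits.append(false); bits.append(true)
print(bits[0], bits[1], bits[2]) // true false true
```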

···

On Fri, Dec 23, 2016 at 3:32 PM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:


_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(I feel like I’ve already written this... I looked through my sent mail and didn’t see anything, but my sincerest apologies if I started this thread a month ago and forgot about it or something.)

I no longer recall exactly what first made me want to do this (probably something in my on-going “teach the compiler calculus” project), but I’ve been thinking lately that it could be quite handy to overload types themselves based on the value of generic parameters. As a somewhat contrived example:
struct Array <T> { /*everything exactly as it is now*/ }
struct Array <T> where T == Bool { /* packs every 8 bools into a UInt8 for more efficient storage */ }

We can already do this with functions… Conceptually this isn’t any different.

Actually this is a major complication because of the runtime generics model. Imagine you have a function that takes a T and constructs an Array<T>. Now it has to do dynamic dispatch to find the right type metadata (is it the “generic” Array<T>, or the specialized Array<Bool>?)

As long as the specific version exposes everything the generic version does (easy for the compiler to enforce), I think everything would just work (famous last words).

What if another module defines an extension of Array<T>, but not ‘Array<T> where T == Bool’?

In this example, the subscript function would need to extract the specified bit and return it as a Bool instead of simply returning the specified element. The `Element` typealias would be `Bool` instead of `UInt8`, which would mean the size/stride might be different than expected (but that’s why we have `MemoryLayout<>`).

Anyway, because generic structs & functions already can’t make assumptions about a generic argument (beyond any constraints, of course), I think this should be in phase 2… but I’m quite hazy on how generics work once the code’s been compiled, so maybe not.

Another way of modeling this might be to define a protocol, say “RepresentableInArray”, which implements get and set methods that take a pointer to a buffer, and an index. A default implementation in a protocol extension would just load and store the value. ‘Bool’ would define its own conformance which performs bitwise operations.
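A minimal sketch of the protocol Slava describes; all names and signatures here are assumptions, not an actual API:

```swift
// Hypothetical "RepresentableInArray": elements know how to read/write
// themselves into a raw buffer.
protocol RepresentableInArray {
    static func get(from buffer: UnsafeRawPointer, at index: Int) -> Self
    static func set(_ value: Self, in buffer: UnsafeMutableRawPointer, at index: Int)
}

extension RepresentableInArray {
    // Default implementation: plain load/store at the element's stride.
    // (A real version would need to restrict this to trivial types.)
    static func get(from buffer: UnsafeRawPointer, at index: Int) -> Self {
        buffer.load(fromByteOffset: index * MemoryLayout<Self>.stride, as: Self.self)
    }
    static func set(_ value: Self, in buffer: UnsafeMutableRawPointer, at index: Int) {
        buffer.storeBytes(of: value, toByteOffset: index * MemoryLayout<Self>.stride, as: Self.self)
    }
}

extension Bool: RepresentableInArray {
    // Bool's own conformance packs 8 values per byte with bitwise ops.
    static func get(from buffer: UnsafeRawPointer, at index: Int) -> Bool {
        buffer.load(fromByteOffset: index / 8, as: UInt8.self) & (1 << UInt8(index % 8)) != 0
    }
    static func set(_ value: Bool, in buffer: UnsafeMutableRawPointer, at index: Int) {
        var byte = buffer.load(fromByteOffset: index / 8, as: UInt8.self)
        if value { byte |= 1 << UInt8(index % 8) } else { byte &= ~(1 << UInt8(index % 8)) }
        buffer.storeBytes(of: byte, toByteOffset: index / 8, as: UInt8.self)
    }
}
```

An array type could then be generic over any `RepresentableInArray` element, getting the packed representation for `Bool` without a second declaration of the type.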

However I think this is out of scope for stage 2 — it doesn’t fix any obvious major shortcoming in the language, it vastly complicates the implementation and it’s not required to achieve our ABI stability goals.

Slava

···

On Dec 23, 2016, at 12:32 PM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:


I'm starting to think my original motivation might’ve had something to do with normally needing storage for both some property and a T, but not needing (local) storage for said property if T conformed to a protocol which itself already required conforming types to have that property? Or maybe as a way to have the “bottom” variable in a hierarchy of wrapper types to break the “reference cycle” for some property? (Come to think of it, those could be the same thing)

That was (and is) an odd project... Anyway, it's been a while since I've thought about it. I'll try to find time today to poke through some old code and see if still have a copy of what I’d gotten stuck on.

And thanks for those links :-)

- Dave Sweeris

···

On Dec 23, 2016, at 2:02 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:


This isn’t what I’d been thinking of earlier, but what about Optionals? I seem to recall reading somewhere that an Optional<Unsafe*Pointer> took up the same amount of storage as the Unsafe*Pointer itself because the compiler could use Unsafe*Pointer's null-pointer value to represent the `.none` case. What if we extended that to any type that’s ExpressibleByNilLiteral & Equatable? I *think* we’d have to introduce the feature of “computed cases” to actually implement it without compiler magic (and getting rid of compiler magic for Optional<Unsafe*Pointer> would be half the point, in this instance):

enum Optional<T> : ExpressibleByNilLiteral {...} // Same as now
enum Optional<T> : ExpressibleByNilLiteral where T: ExpressibleByNilLiteral & Equatable {
    case some(T) // Only one case, so storage-wise, nothing extra is needed
    init(_ value: T) { self = .some(value) }
    init(nilLiteral: ()) { self = .some(nil as T) }
    /* "var case `label`" declares a "computed case", a case that doesn't need
     * any extra bits to "store".
     * In the return value, `.matches` tells the pattern matcher if there's a
     * match, and `.associatedValue` carries any associated values. It probably
     * shouldn't even exist in this example, since we're modeling `.none` rather
     * than `.none(Void)`, but Swift doesn't currently allow for single-element
     * tuples. In practice, all this means is that we'd need to consider `case
     * label(Void)` to be the same as `case label`.
     */
    var case none: (matches: Bool, associatedValue: Void) {
        switch self {
        case .some(let value): return (value == nil as T, ())
        }
    }
}

- Dave Sweeris

···

On Dec 23, 2016, at 3:28 PM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:


(I feel like I’ve already written this... I looked through my sent mail and didn’t see anything, but my sincerest apologies if I started this thread a month ago and forgot about it or something.)

I no longer recall exactly what first made me want to do this (probably something in my on-going “teach the compiler calculus” project), but I’ve been thinking lately that it could be quite handy to overload types themselves based on the value of generic parameters. As a somewhat contrived example:
struct Array <T> { /*everything exactly as it is now*/ }
struct Array <T> where T == Bool { /* packs every 8 bools into a UInt8 for more efficient storage */ }

We can already do this with functions… Conceptually this isn’t any different.

Actually this is a major complication because of the runtime generics model. Imagine you have a function that takes a T and constructs an Array<T>. Now it has to do dynamic dispatch to find the right type metadata (is it the “generic” Array<T>, or the specialized Array<Bool>?)

Wouldn’t that get resolved at compile time? Oh! Wait, are you talking about this?
func bar <T> (_ x: T) -> String { return "any T" }
func bar <T> (_ x: T) -> String where T: CustomStringConvertible { return "\(x)" }
func foo <T> (_ x: T) -> String { return bar(x) }
and `foo()` always returns “any T” because by the time we get to `bar`, T isn’t known to be constrained to anything at compile time?

In any case, since the point is to optimize the type's implementation rather than change its behavior, I think the only result is that your array maker function would return an “unoptimized” array. I’m not sure… I’d have to think about it some more. And come up with a better illustration, since it’s been pointed out that my original example isn’t a particularly good idea.

As long as the specific version exposes everything the generic version does (easy for the compiler to enforce), I think everything would just work (famous last words).

What if another module defines an extension of Array<T>, but not ‘Array<T> where T == Bool’?

IIUC, that shouldn’t matter because the specific version would have to have the same public interface as the generic version.

In this example, the subscript function would need to extract the specified bit and return it as a Bool instead of simply returning the specified element. The `Element` typealias would be `Bool` instead of `UInt8`, which would mean the size/stride might be different than expected (but that’s why we have `MemoryLayout<>`).

Anyway, because generic structs & functions already can’t make assumptions about a generic argument (beyond any constraints, of course), I think this should be in phase 2… but I’m quite hazy on how generics work once the code’s been compiled, so maybe not.

Another way of modeling this might be to define a protocol, say “RepresentableInArray”, which implements get and set methods that take a pointer to a buffer, and an index. A default implementation in a protocol extension would just load and store the value. ‘Bool’ would define its own conformance which performs bitwise operations.

However I think this is out of scope for stage 2 — it doesn’t fix any obvious major shortcoming in the language, it vastly complicates the implementation and it’s not required to achieve our ABI stability goals.

Fair enough. Is there a stage 3, or does that mean “out of scope until Swift 4.1+”?

- Dave Sweeris

···

On Feb 22, 2017, at 3:00 PM, Slava Pestov <spestov@apple.com> wrote:

On Dec 23, 2016, at 12:32 PM, David Sweeris via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Note that this is not how the Unsafe*Pointer optimization works: in that case, the type Unsafe*Pointer is not ExpressibleByNilLiteral, it just has some spare/guaranteed-to-be-unused sentinel values in its representation, that the compiler knows about. This also works for types like `enum X { case A, B }`: an optional like X? (and X??, all the way up to 254 ?’s) is also represented as a single byte, because there’s spare values that the compiler knows won't be used by valid instances of X.

Converting valid instances to .none would break round-tripping: Optional(x)! would sometimes fail, if x was the sentinel value. This seems likely to cause generic code to stop working.
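The invariant at stake can be written as a one-line property that generic code relies on (a sketch, not from the original message):

```swift
// Generic code assumes Optional(x)! always round-trips to x. A scheme
// that folds some valid value of T into `.none` breaks this: for that
// sentinel value, Optional(x) would compare equal to nil and `!` would trap.
func roundTrip<T>(_ x: T) -> T {
    return Optional(x)!   // must return x for every possible x
}
```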

Huon

Example program for the enum optimization:

enum X {
    case A, B
}

typealias O2<T> = T??
typealias O4<T> = O2<O2<T>>
typealias O8<T> = O4<O4<T>>
typealias O16<T> = O8<O8<T>>
typealias O32<T> = O16<O16<T>>
typealias O64<T> = O32<O32<T>>
typealias O128<T> = O64<O64<T>>
typealias O256<T> = O128<O128<T>>

typealias O254<T> = O128<O64<O32<O16<O8<O4<O2<T>>>>>>>
typealias O255<T> = O254<T>?

func size<T>(_: T.Type) -> Int {
    return MemoryLayout<T>.size
}

print(size(X.self), // 1
      size((X?).self), // 1
      size(O254<X>.self), // 1
      size(O255<X>.self), // 2
      size(O256<X>.self)) // 3

(The last is 3 because the compiler currently essentially treats O255<X> like an opaque/all-values-valid 2-byte type, like Int16, and so adds an extra byte for the extra ?. It could theoretically be 2 if the unused values in the extra byte in O255<X> are reused.)

···

On Feb 22, 2017, at 10:20, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:


(I feel like I’ve already written this... I looked through my sent mail and didn’t see anything, but my sincerest apologies if I started this thread a month ago and forgot about it or something.)

I no longer recall exactly what first made me want to do this (probably something in my on-going “teach the compiler calculus” project), but I’ve been thinking lately that it could be quite handy to overload types themselves based on the value of generic parameters. As a somewhat contrived example:
struct Array <T> { /*everything exactly as it is now*/ }
struct Array <T> where T == Bool { /* packs every 8 bools into a UInt8 for more efficient storage */ }

We can already do this with functions… Conceptually this isn’t any different.

Actually this is a major complication because of the runtime generics model. Imagine you have a function that takes a T and constructs an Array<T>. Now it has to do dynamic dispatch to find the right type metadata (is it the “generic” Array<T>, or the specialized Array<Bool>?)

Wouldn’t that get resolved at compile time? Oh! Wait, are you talking about this?
func bar <T> (_ x: T) -> String { return "any T" }
func bar <T> (_ x: T) -> String where T: CustomStringConvertible { return "\(x)" }
func foo <T> (_ x: T) -> String { return bar(x) }
and `foo()` always returns “any T” because by the time we get to `bar`, T isn’t known to be constrained to anything at compile time?

I mean this:

func foo<T>(t: T) -> [T] {
  return [t]
}

foo(t: false) // is this the ‘optimized’ or ‘unoptimized’ version?
foo(t: 123)

In any case, since the point is to optimize the type's implementation rather than change its behavior, I think the only result is that your array maker function would return an “unoptimized” array.

Ok then, what if I have:

func foo<T>(x: T, y: T) {

}

And I call foo() with an ‘optimized’ Array<Bool> and ‘unoptimized’ Array<Bool>. That won’t work at all, since it’s only passing one type metadata parameter for T, and not one for each value. So which one would it pass?

I’m not sure… I’d have to think about it some more. And come up with a better illustration, since it’s been pointed out that my original example isn’t a particularly good idea.

As long as the specific version exposes everything the generic version does (easy for the compiler to enforce), I think everything would just work (famous last words).

What if another module defines an extension of Array<T>, but not ‘Array<T> where T == Bool’?

IIUC, that shouldn’t matter because the specific version would have to have the same public interface as the generic version.

But with the extension in place, now they have different interfaces, because the optimized version won’t have the extension’s methods in it.

In this example, the subscript function would need to extract the specified bit and return it as a Bool instead of simply returning the specified element. The `Element` typealias would be `Bool` instead of `UInt8`, which would mean the size/stride might be different than expected (but that’s why we have `MemoryLayout<>`).

Anyway, because generic structs & functions already can’t make assumptions about a generic argument (beyond any constraints, of course), I think this should be in phase 2… but I’m quite hazy on how generics work once the code’s been compiled, so maybe not.

Another way of modeling this might be to define a protocol, say “RepresentableInArray”, which implements get and set methods that take a pointer to a buffer, and an index. A default implementation in a protocol extension would just load and store the value. ‘Bool’ would define its own conformance which performs bitwise operations.

However I think this is out of scope for stage 2 — it doesn’t fix any obvious major shortcoming in the language, it vastly complicates the implementation and it’s not required to achieve our ABI stability goals.

Fair enough. Is there a stage 3, or does that mean “out of scope until Swift 4.1+”?

The latter.

Slava

···

On Feb 22, 2017, at 3:39 PM, David Sweeris <davesweeris@mac.com> wrote:

On Feb 22, 2017, at 3:00 PM, Slava Pestov <spestov@apple.com <mailto:spestov@apple.com>> wrote:

On Dec 23, 2016, at 12:32 PM, David Sweeris via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:


It sounds like what you want is similar to C++ template specialization (also something I’ve been asking for). Another slightly different form you can imagine it taking is:

class Array<StorageType: Storage> {
  subscript(index: [Int]) -> StorageType.ElementType { /* … */ }
}

extension Array where StorageType:FloatStorage {
  // define specific variables for this specialization
  // and define behavior for subscript
}

Another example might be (if ints were allowed as generic parameters):

struct Factorial<N: Int> {
  static var value: Int { return N * Factorial<N - 1>.value }
}

struct Factorial<N: Int> where N == 1 {
  static var value: Int { return 1 }
}

let value = Factorial<10>.value

The first example can in theory be done using runtime information (though as stated in my previous posts, I still can’t get it to work correctly in Swift). The second clearly needs to be done at compile time and could potentially benefit from the `constexpr` discussed in the `pure` function thread. A slightly different formulation could be:

constexpr func factorial<N: Int>() -> Int { return N * factorial<N - 1>() }
constexpr func factorial<N: Int>() -> Int where N == 1 { return 1 }

Right now generics in Swift feel closer to Java’s generics than C++ templates. I think these types of constructs are extremely useful to a language and would disagree with anyone who says they aren’t needed (e.g., look at Boost and Eigen). However, I can also appreciate that adding these features to a language probably should be done with lots of care and thought.

A

What if we extended that to any type that’s ExpressibleByNilLiteral & Equatable?

Note that this is not how the Unsafe*Pointer optimization works: in that case, the type Unsafe*Pointer is not ExpressibleByNilLiteral, it just has some spare/guaranteed-to-be-unused sentinel values in its representation, that the compiler knows about. This also works for types like `enum X { case A, B }`: an optional like X? (and X??, all the way up to 254 ?’s) is also represented as a single byte, because there’s spare values that the compiler knows won't be used by valid instances of X.

Converting valid instances to .none would break round-tripping: Optional(x)! would sometimes fail, if x was the sentinel value. This seems likely to cause generic code to stop working.

Oh, hey, yeah, good point! I should’ve thought about that bit more first. Would a `HasInvalidBitPattern` protocol work?
protocol HasInvalidBitPattern {
    /// An unreachable bit pattern. Very Bad Things are very likely to happen if this represents a valid value.
    static var invalidBitPattern: Self {get} // probably use some private init to create this, or some other mechanism that doesn’t get exposed to the client
}
Then, the special-case Optional definition would be:
enum Optional<T> : ExpressibleByNilLiteral where T: HasInvalidBitPattern & Equatable {
    case some(T)
    init(_ value: T) { self = .some(value) }
    init(nilLiteral: ()) { self = .some(T.invalidBitPattern) }
    var case none: (matches: Bool, associatedValue: Void) {
        switch self {
        case .some(let value): return (value == T.invalidBitPattern, ())
        }
    }
}

Example program for the enum optimization:

enum X {
    case A, B
}

typealias O2<T> = T??
typealias O4<T> = O2<O2<T>>
typealias O8<T> = O4<O4<T>>
typealias O16<T> = O8<O8<T>>
typealias O32<T> = O16<O16<T>>
typealias O64<T> = O32<O32<T>>
typealias O128<T> = O64<O64<T>>
typealias O256<T> = O128<O128<T>>

typealias O254<T> = O128<O64<O32<O16<O8<O4<O2<T>>>>>>>
typealias O255<T> = O254<T>?

func size<T>(_: T.Type) -> Int {
    return MemoryLayout<T>.size
}

print(size(X.self), // 1
      size((X?).self), // 1
      size(O254<X>.self), // 1
      size(O255<X>.self), // 2
      size(O256<X>.self)) // 3

(The last is 3 because the compiler currently essentially treats O255<X> like an opaque/all-values-valid 2-byte type, like Int16, and so adds an extra byte for the extra ?. It could theoretically be 2 if the unused values in the extra byte in O255<X> are reused.)

That’s quite interesting, thanks!

- Dave Sweeris

···

On Feb 22, 2017, at 12:21 PM, Huon Wilson <huon@apple.com> wrote:

On Feb 22, 2017, at 10:20, David Sweeris via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

K

- Dave Sweeris

···

On Feb 22, 2017, at 3:42 PM, Slava Pestov <spestov@apple.com> wrote:

On Feb 22, 2017, at 3:39 PM, David Sweeris <davesweeris@mac.com <mailto:davesweeris@mac.com>> wrote:

Fair enough. Is there a stage 3, or does that mean “out of scope until Swift 4.1+”?

The latter.

(Regardless of exactly how it could be done, though, my point was that it at least seems like removing the compiler magic around `Optional<Unsafe*Pointer>` would be an application for being able to overload the storage of a generic type.)

- Dave Sweeris

···

On Feb 22, 2017, at 1:22 PM, David Sweeris <davesweeris@mac.com> wrote:

On Feb 22, 2017, at 12:21 PM, Huon Wilson <huon@apple.com <mailto:huon@apple.com>> wrote:

On Feb 22, 2017, at 10:20, David Sweeris via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:


What if we extended that to any type that’s ExpressibleByNilLiteral & Equatable?

Note that this is not how the Unsafe*Pointer optimization works: in that case, the type Unsafe*Pointer is not ExpressibleByNilLiteral, it just has some spare/guaranteed-to-be-unused sentinel values in its representation, that the compiler knows about. This also works for types like `enum X { case A, B }`: an optional like X? (and X??, all the way up to 254 ?’s) is also represented as a single byte, because there’s spare values that the compiler knows won't be used by valid instances of X.

Converting valid instances to .none would break round-tripping: Optional(x)! would sometimes fail, if x was the sentinel value. This seems likely to cause generic code to stop working.

Oh, hey, yeah, good point! I should’ve thought about that bit more first. Would a `HasInvalidBitPattern` protocol work?

This is basically how the optional payload optimization works behind the scenes. It’s not specific to optional, or having a certain payload type, it works for other enums also.

IRGen knows for each builtin type, what the valid and invalid bit patterns are. Basically reference types and unsafe pointers are the ones that are known to have invalid bit patterns (nulls, and for references, we also know the least significant 3 bits are always zero because of alignment). From this it can compute the valid and invalid bit patterns of tuple types and structs that appear in enum payloads also.
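The observable effect of the spare-bit scheme Slava describes can be checked with `MemoryLayout` (sizes shown are for a 64-bit platform):

```swift
// Pointer and reference optionals pay no extra byte, because the compiler
// uses the invalid (null) bit pattern to encode `.none`; Int has no
// invalid bit patterns, so Int? needs a separate tag byte.
class Ref {}
print(MemoryLayout<UnsafeRawPointer>.size)   // 8
print(MemoryLayout<UnsafeRawPointer?>.size)  // 8 — .none uses the null pattern
print(MemoryLayout<Ref?>.size)               // 8
print(MemoryLayout<Int?>.size)               // 9 — extra tag byte
```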

Slava

···

On Feb 22, 2017, at 1:22 PM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:

On Feb 22, 2017, at 12:21 PM, Huon Wilson <huon@apple.com <mailto:huon@apple.com>> wrote:

On Feb 22, 2017, at 10:20, David Sweeris via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:


Hi @Slava_Pestov – That certainly is a good argument against specializations that supersede more generic types, but what about the non-overlapping case? Why does the compiler not allow G to be redeclared below, but does allow g?

// 'A' and 'B' are not subtypes of each other
struct G<T:A> { … }
struct G<T:B> { … }

func g<T:A>(a:T) {}
func g<T:B>(b:T) {}
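A concrete, compilable version of the question (A and B are placeholder protocols; the rejected declaration is left commented out):

```swift
// With disjoint constraints the compiler accepts overloaded generic
// functions but rejects a second declaration of the type.
protocol A {}
protocol B {}

func g<T: A>(_ a: T) -> String { return "A version" }
func g<T: B>(_ b: T) -> String { return "B version" }

struct G<T: A> {}
// struct G<T: B> {}   // error: invalid redeclaration of 'G'

struct OnlyA: A {}
print(g(OnlyA()))   // "A version"
```

Note that if some type conformed to both A and B, a call to `g` with it would be ambiguous; the type-level analogue of that ambiguity may be part of why the redeclaration is rejected outright.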

P.S. – I'm not sure what the right Discourse forum etiquette is here. Revive this thread? Or start a new one?

Generally people like if you start a new thread that summarizes past discussions and links back to previously brought up topics. It generally saves people from having to go through and reread a ton of past conversations.

But since you've already bumped this thread, and it's pretty short, I would just keep this one.