[Pitch] Improved Compiler Support for Large Homogenous Tuples

Ben_Cohen · May 29, 2021, 3:42pm

This is a minor point compared to the fundamental questions about the approach, but I'd prefer to keep the unsafe initializer to just the one that uses a buffer, and instead have the closure-based initializer take (Index) -> Element. Hopefully the optimizer could handle many cases where the initialization can actually be done in-place.

There's also a lot of design space to be explored in initializing these things from other sequences (e.g. should it trap on size mismatch or should there be multiple strategies for handling that, similar to Dictionary's initialization from a sequence of (Key,Value).)

DevAndArtist · May 29, 2021, 3:46pm

[quote="Joe_Groff, post:21, topic:49023, full:true"]

Well I guess that's my miss, I totally forgot that we had that proposal accepted. Forget my part regarding the protocol then.

ensan-hcl · May 29, 2021, 3:48pm

My concern about HomogeneousTuple is, we must create function only for one-length-tuple, if we want to create generic function that takes type of value that conforms HomogeneousTuple.

func someOperationWithHomogeneousTuple<T: HomogeneousTuple>(_ tuple: T) {
    for value in tuple {
        print(value)
    }
}

func someOperationWithOneLengthTuple<T>(_ value: T) {
    print(value)
}

struct Vector<T: HomogeneousTuple> {}
Vector<()>          // OK
Vector<(Int)>       // Error
Vector<(Int, Int)>  // OK

To avoid it, I think we should have a parallel discussion of variadic generics.

nikitamounier · May 29, 2021, 5:42pm

I think this is a pragmatic solution to a problem which most likely needs to be solved now. We're definitely a year or two away from upgrading our generics to everything laid out in the Generics Manifesto and in the discussion on Improving the UI of generics, but I think until then this is an appropriate solution – similarly to Unlocking existential types to all protocols.

taylorswift · May 29, 2021, 7:15pm

before we go down the “generic value parameters” path, we need to think about how this is going to fit in with conditional conformances, because that is going to be crucial for the actual usability of this feature. for example, let’s say we want to compute cross products for 2- and 3-dimensional vectors, and we want to factor that into a protocol CrossProductable:

protocol CrossProductable 
{
    associatedtype CrossProduct

    static func >|< (lhs:Self, rhs:Self) -> CrossProduct
}

Currently, the closest thing we have to generic parameterization on element count is using a generic type parameter constrained to conform to SIMD:

struct Vector<Storage, T> 
    where Storage:SIMD, T:SIMDScalar, T == Storage.Scalar
{
}

typealias Vector2<T> = Vector<SIMD2<T>, T> where T:SIMDScalar 
typealias Vector3<T> = Vector<SIMD3<T>, T> where T:SIMDScalar 
typealias Vector4<T> = Vector<SIMD4<T>, T> where T:SIMDScalar 
typealias Vector8<T> = Vector<SIMD8<T>, T> where T:SIMDScalar 
...

a generic type can only have one conditional conformance per protocol, so the following does not work:

// cannot have more than one conditional conformance, 
// even with different conditional bounds!
extension Vector:CrossProductable where Storage == SIMD2<T>
{
    static func >|< (lhs:Self, rhs:Self) -> T
}
extension Vector:CrossProductable where Storage == SIMD3<T>
{
    static func >|< (lhs:Self, rhs:Self) -> Self
}

The workaround is to use a unification protocol on the Storage type, and then write the conditional conformance in terms of the unification protocol:

protocol CrossProductableStorage:SIMD 
{
    associatedtype CrossProduct
    static func cross(a:Self, b:Self) -> CrossProduct
}
extension SIMD2:CrossProductableStorage
{
    ...
}
extension SIMD3:CrossProductableStorage
{
    ...
}

extension Vector:CrossProductable 
    where Storage:CrossProductableStorage 
{
    static func >|< (lhs:Self, rhs:Self) -> Storage.CrossProduct
}

but it’s hard to see how this could possibly work with value parameters, since values are not types, and cannot be extended.

extension 2 as Int:CrossProductableArity
{
    typealias CrossProduct = Int
}
extension 3 as Int:CrossProductableArity
{
    // what goes here??    ----------v
    typealias CrossProduct = Vector<???, count: 3>
}

benrimmington · May 30, 2021, 12:13am

xwu:

—are you hinting at the possibility of supporting not-purely-homogeneous tuples? Certainly, by the way in which you phrase this feature as “sugar expanded before type checking,” one is given to wondering if you intend to support, for example:
typealias T = (flags: Int, 256 x UInt8)

In @xwu's example, only the .flags and .1 through .256 members are available, because T doesn't conform to the HomogeneousTuple protocol. The feature doesn't seem very useful, and it would be better to use a nested homogeneous tuple:

typealias T = (flags: Int, bytes: (256 x UInt8))

I'm also curious about the following sentence:

We would restrict ones ability to only use homogenous tuple spans in tuples that are otherwise single element lists.

Possibly the HomogeneousTuple initializers aren't needed, if there is:

a syntax sugar for homogeneous tuple types and expressions,
a non-default implementation of the withContiguousMutableStorageIfAvailable(_:) method.

Alternative Syntaxes

An attribute-like syntax:

typealias XGA = (@1024x (@768x UInt8))

var xga: XGA = (@1024x (@768x .zero))

Or a macro-like syntax:

typealias XGA = #tuple(
  repeating: #tuple(repeating: UInt8, count: 768),
  count: 1024
)

var xga: XGA = #tuple(
  repeating: #tuple(repeating: .zero, count: 768),
  count: 1024
)

The count parameter might have a default argument when the tuple type is known?

UPDATE: Or labelled with a range.

// Tuple types labelled with closed or half-open ranges.
typealias uuid_t = (0...15: UInt8)
typealias uuid_t = (0..<16: UInt8)

// Tuple expressions labelled with partial or unbounded ranges.
let _: uuid_t = (0...:  .zero)
let _: uuid_t = (...15: .zero)
let _: uuid_t = (..<16: .zero)
let _: uuid_t = (...:   .zero)

Could a marker protocol be used?

@_marker protocol Tuple {}
@_marker protocol HomogeneousTuple: Tuple {}

extension HomogeneousTuple where
Self: MutableCollection & RandomAccessCollection { /*...*/ }

Michael_Gottesman · May 30, 2021, 4:52am

A subsequence of a homogeneous tuple would be another tuple.

No.

Yes.

I am not taking a hard position on this. I do think that in the short term and for this proposal we want this to be something that only the stdlib can do. That is part of what makes adding HomogenousTuple as a protocol interesting is that it allows for the protocol part of the implementation to not have to be written in assembly. We can still delegate to the runtime when necessary though. So it yields a nice separation of concerns.

Michael_Gottesman · May 30, 2021, 4:55am

Follow-up question. Based on this naming choice—
Michael_Gottesman:
type-pure-homogenous-tuple:
    '(' type-homogenous-tuple-span ')'
—are you hinting at the possibility of supporting not-purely-homogeneous tuples? Certainly, by the way in which you phrase this feature as “sugar expanded before type checking,” one is given to wondering if you intend to support, for example:
typealias T = (flags: Int, 256 x UInt8)

This possibility fell out of the implementation when I wrote the sugar. That being said I think it distracts from the goal of this pitch: to be laser focused on the needs of system programmers not general interesting language constructs so I took it out.

In fact the initial branch I put together allowed for one to write this:

typealias T = (3 x Float, 2 x Int, String, Double)

Michael_Gottesman · May 30, 2021, 4:58am

No. I would make that an error.

I don't really have strong feelings about the syntax beyond what JoeG mentioned earlier about the number being first.

No.

Michael_Gottesman · May 30, 2021, 5:01am

This is meant to be a pragmatic incremental approach that improves the lives of low level programmers. Perfection can be the enemy of the good. Waiting for generics over values is ignoring the issue of now.

In terms of the usage of 'x' and the general syntax, I am not taking a strong position on such things beyond the issue that JoeG mentioned.

Michael_Gottesman · May 30, 2021, 5:07am

allevato:

This feature is a long time coming; but does the sugar as described here go far enough to meet the performance goals?

Michael_Gottesman:

This sugar is expanded before type checking, so beyond the homogenous bit that
we set in the tuple type itself, the rest of the compiler just sees a normal
tuple and thus will be minimally effected.

Does this mean that if I declare let x: (5 x Int), the sugar isn't carried over into SIL/IR and it'll be equivalent to (Int, Int, Int, Int, Int)? Likewise, will an imported C type int[5] have the same IR representation as it does today?

A situation a colleague ran into a while back involved importing a C module with a struct that had a fairly large two-dimensional array, where I think each of its dimensions happened to fall right under the ClangImporter limit:
struct foo {
  int bar[4096][4096];
};
We found that when this struct was used from Swift, compile time slowed to a crawl, but type checking wasn't the problem in this case—it was the SIL -> IR and IR -> machine code lowering.

The problem is that Swift's lowering of the type doesn't take any advantage of the multiplicity. When that struct is used from C, Clang emits it compactly using LLVM array types:
%struct.foo = type { [4096 x [4096 x i32]] }
Swift, importing the same C type, emits something I can't repeat on this forum because it lists every single element separately:
%Ts5Int32V = type <{ i32 }>
%TSo3foo = %type <{ <{
    <{ %Ts5Int32V, %Ts5Int32V, ... column repeated 4096 times ... }>,
    <{ %Ts5Int32V, %Ts5Int32V, ... column repeated 4096 times ... }>,
    ... row repeated 4096 times ...
}> }>
I actually gave up waiting for the 4096x4096 case to finish compiling and lowered it to 1024x1024. A Swift program that did nothing but zero-initializes that struct, compiled without optimizations, generates 10.5MB of x86_64 machine code after about 15 seconds on my machine. Fortunately compiling with -O turns that into a single bzero call, but we shouldn't have to rely on the optimizer here.

Swift tuples suffer from the same issue today, but most folks wouldn't have the patience to write one long enough. However, this syntax would give users the power to do so easily:
struct Foo {
  var bar: (4096 x (4096 x Int))
}
So a solution to this problem needs to make sure that it plumbs the multiplicity down far enough that it lowers to LLVM arrays; it shouldn't be possible for a simple code snippet like the one above to make the compiler appear to hang.

So this is a bigger issue that I mentioned when I spoke about the ABI issues. Specifically, we eagerly split tuple arguments/results in the Swift function call ABI. We have two options here: one try to use thunking (with mandatory inlining to try to hide any function calls that can not use the newer ABI) or to break the tuple ABI for large enough N. I originally in the proposal suggested breaking the stable ABI for large enough tuples given that (as you mentioned) it is really slow to compile so people probably don't use it. I am still on the fence about it but an ABI break is a big deal and comes with a bunch of consequences that need to be considered.

In terms of the optimizer, we already do not expand large types (the expansion I mentioned above happens at least in SILGen if not earlier), so I do not expect that to be an issue here. The evil destructuring happens much earlier.

Michael_Gottesman · May 30, 2021, 5:09am

grynspan:

Weren't we just talking about this in my temporary buffers pitch thread?

@Michael_Gottesman: Improvements to C array interop are long-overdue, and being able to access homogeneous tuples in a more sequence-like way is something I know I'd use.

I don't love the proposed shorthand syntax though. The use of x, which in any other context is an identifier, doesn't sit well with me. Off the top of my head, a few syntaxes that might express the idea of a fixed-size array better:
var option1: [Int, 10]
var option2: [Int * 10]
var option3: [Int ..< 10]
var option4: Int[10]
I understand this is all syntactic sugar over tuples but I view that as an implementation detail: folks want fixed-size arrays, so let folks work with these values as if they were arrays. Hence, square brackets. If we really want to use round brackets as we do with tuples, fiiiiine... but please give a passing consideration to alternative shorthand spellings for them too.

As I mentioned, I am not really tied to any specific syntax beyond what JoeG mentioned earlier around the count being first. We could use a *. One aspect of prior art is that in LLVM-IR LLVM array types use the x like I did in this syntax suggestion.

Michael_Gottesman · May 30, 2021, 5:16am

If you are talking about a fixed size Swift.Array, one must realize that this is actually solving a different problem. The specific problem is that Swift doesn't have a primitive to express int x[5] in a nice way beyond tuples (hence the usage in the clang importer but also the pain involved). That is where the proposal started from the acknowledgement of that and then picks low hanging fruit.

Additionally, if we were to do a fixed size array I believe it would actually be a SmallArray that stores elements inline and moves them to the heap if it gets too large. That is a much different thing.

That is a much bigger project with many dependencies. This is trying to be pragmatic, incremental, yet solve real problems for stdlib/system programmers. See JoeG's response around Rust from earlier.

Michael_Gottesman · May 30, 2021, 5:20am

Ok. I am fine with that. I was just trying to match the other initializer. It was an arbitrary choice! Thanks for the feedback!

ensan-hcl · May 30, 2021, 5:36am

Thank you for clarifying.

Does (1 x T) conform to HomogeneousTuple? If so, from now on all types conform to HomogeneousTuple and therefore RandomAccessCollection and MutableCollection, because in Current Swift one-length-tuple does not exist and (T) == T.

However, excluding (1 x T) from HomogeneousTuple makes another problem as I mentioned here. Making it @_marker protocol as @benrimmington mentioned is a bit better because we cannot use marker protocol as constraints. But I doubt marker protocol can be used in this case as written in the proposal:

They cannot have requirements of any kind
They cannot inherit from non-marker protocols

I think we should use other way than HomogeneousTuple.

Nickolas_Pohilets · May 30, 2021, 10:09am

But of what size? This requirement from RandomAccessCollection implies that Subsequence is of variable size:

subscript(bounds: Range<Self.Index>) -> Self.SubSequence { get }

Also note that subsequence is required to share indices with parent sequence.

Some options I can think of:

Return a wrapper containing entire original tuple as a backing storage. No type-erasure or heap-allocation needed.

let x: (4096 x Double) = …
let y: TupleSlice<(4096 x Double)> = x[i..<j]
assert(MemoryLayout<TupleSlice<(4096 x Double)>>.size == 4096 * 8 + 2 * 8)

Return Swift.ArraySlice, Only relevant data is copied, but requires heap allocation.

let x: (4096 x Double) = …
let y: ArraySlice<Double> = x[i..<j]

sighoya · May 30, 2021, 10:47am

taylorswift:

but it’s hard to see how this could possibly work with value parameters, since values are not types, and cannot be extended.
extension 2 as Int:CrossProductableArity
{
    typealias CrossProduct = Int
}
extension 3 as Int:CrossProductableArity
{
    // what goes here??    ----------v
    typealias CrossProduct = Vector<???, count: 3>
}

Value generics are orthogonal to the example you presented as your example needs singleton types to work for that.

With value generics you can do that:

struct Vector<T,n:Int> {...}
extension Vector<Int,2>:CrossProduct {...}
extension Vector<Int,3>:CrossProduct {...}

Doesn't it suffice to solve your problem? I'm sorry, but I'm too inexperienced to fully understand your example.

rsebbe · May 30, 2021, 2:33pm

This is a use case were C has been superior to Swift for many years, mostly just because it was set aside although this is a basic yet useful feature, and not just for system developers. Glad to see it finally addressed.

Comments:

Tuple vs. array syntax: I understand the hesitation, it is an array implemented as (& compatible with standard) tuple. No strong opinion on that.
The syntax: I don't like x being used as a symbol of the language. * seems preferable (or some other syntax).
C's multidimensional arrays (int[20][10]) need to be addressed and be intuitive.

Thank you so much for shedding light on this & for the pragmatic approach

allevato · May 30, 2021, 4:47pm

If there are nested tuples, would "large enough N" be computed based on a single dimension, or based on the total number of elements after expansion? If it's only per-dimension, then it would always be possible to nest tuples to some degree just under that limit and end up with poor compile time and code size characteristics.

I can imagine multi-dimensional homogeneous tuples (or even large single-dimensional ones) being used in high-performance mathematical/scientific computing applications to avoid heap allocations associated with arrays, and if the generated code for those is huge in non-optimized builds and expansion of the type representation causes compiler performance issues, then that's going to be a big obstacle for adoption (and a very confusing class of issues to debug/work around).

So it's hard to imagine a scenario where some ABI breaks aren't on the table to fix this. Otherwise, if the current ABI has to stay unchanged, I think introducing this homogeneous tuple syntax would do more harm than good and fixed-length arrays as a new concept should be created instead—but could imported C arrays be migrated over to use those without raising similar compatibility concerns?

sveinhal · May 30, 2021, 5:50pm

I generally like this, although I personally haven't used C imports that often.

However, I really want us to reconsider using (5 x Int) as the syntax. I don't think we have any precedence of using letter symbols in this kind of way in Swift. I think we should use some operator-style symbol. (5 * Int) comes to mind, but any easily typed non-letter symbol would be fine by me.