[Pitch] InlineArray type sugar

Could you give an example (that compiles) of how to define an operator Type ** n that would accept Int ** 5? Just out of curiosity. If there is I understand why x makes a better choice (for some reason, I've confused it with the LaTeX symbol \times), and didn't realize why x was chosen. Anyway, x is wrong mathematically, a more . Perhaps xx would be a better choice. (Or ar: it is intuitive, as it can be read as both array and arrow. Then n ar T would be better than [n x T], since the latter is wrong mathematically.)

Is the whole idea behind using x is to avoid using operator notation?

No it is not.

If you have an apple and another apple, then you have two apples, not an apple squared.

It is wrong. The type (Int, Int) is mathematically Int x Int, not Int + Int.
An ordered pair of apples is an element of the set Apple x Apple which is exactly "Apple squared". Anyway, there's no point to argue against established mathematical notation.

something like

enum IntSum {
    case first(Int)
    case second(Int)
}

represents the type of Int (disjoint union) Int which is "sort of" Int + Int

Historically, tuples have been called product types and discriminated unions sum types...
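
To make the sum/product terminology concrete, here is a small sketch of my own that counts the values of each kind of type. Trit, TritSum, and TritPair are hypothetical names chosen for illustration; nothing here comes from the pitch itself.

```swift
// Hypothetical three-valued type, used only to count inhabitants.
enum Trit: CaseIterable { case a, b, c }

// Sum type: a value is EITHER a first Trit OR a second Trit.
enum TritSum {
    case first(Trit)
    case second(Trit)
}

// Product type: a value holds a first Trit AND a second Trit.
typealias TritPair = (Trit, Trit)

// Enumerate every value of the sum type.
var allSums: [TritSum] = []
for t in Trit.allCases {
    allSums.append(.first(t))
    allSums.append(.second(t))
}

// Enumerate every value of the product type.
var allPairs: [TritPair] = []
for l in Trit.allCases {
    for r in Trit.allCases {
        allPairs.append((l, r))
    }
}

print(allSums.count)  // 6, i.e. 3 + 3
print(allPairs.count) // 9, i.e. 3 * 3
```

The enum has 3 + 3 values and the tuple has 3 * 3, which is where the names "sum" and "product" come from.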

I am speaking as a mathematician, with a degree in mathematics, and I want to be abundantly clear when I say this:

Two things can be true at the same time.

• • •

The notation ℤ² does indeed mean “the set of ordered pairs of integers”. That is absolutely correct. And the existence of that notation has no effect whatsoever on the validity of other notations for the same concept.

Multiplication is the standard way to indicate multiplicity, and it is immediately understood by everyone, even schoolchildren who have not yet learned how to actually multiply numbers.

Two different notations can mean the same thing, and that does not make either one of them wrong.


Also, speaking as a mathematician, with a degree in mathematics, I stand by my claim. The type of ordered $n$-tuples of elements of $T$ should always be $T^n$ and not $n x T$.

However, since it seems like we're set on using [] notation for it anyway, [T**n] and suggested notations like [n -> T] do not make more sense than [n x T]. Still, maybe something else can be used instead of x? Since an operator can't begin with :, maybe something like [n :: T] or [n :-> T] could be used instead of [n x T]?


This is really going off on a tangent. Formally in a type algebra, (A, B) is a product type, so tupling is multiplication and repeated tupling is exponentiation. Outside of type theory, in pragmatic programming, + is frequently used for concatenation, which makes * or x defensible for tupling.

There's a principled type-theory argument that eliminates multiplication, but Swift already uses + for concatenation, which makes multiplication-like notation appropriate based on pragmatic considerations, unless you have a time machine to go back and fix +. So we cannot just appeal to first-principles here.
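
As a rough illustration of the pragmatic point (a sketch of my own, using only standard-library calls): if + means concatenation, then repeating an element n times is the same as adding n copies together, which is what a multiplication-like notation suggests.

```swift
// + on arrays is concatenation, not element-wise addition.
let joined = [1, 2] + [3, 4]

// Repetition written out as repeated "addition" of one-element arrays...
let viaPlus = [7] + [7] + [7]

// ...matches the standard-library spelling of "3 copies of 7".
let viaRepeat = Array(repeating: 7, count: 3)

print(joined)               // [1, 2, 3, 4]
print(viaPlus == viaRepeat) // true
```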

I would like whatever syntax we settle on to be clear and generalizable, ideally to both repeated values and multidimensional arrays (with repeated values being probably the more important consideration).

I don't love x as a symbol, because although it's fairly uncommon to use x as a name for an array count (though it does happen), it's very common to use it as a name for a repeated value in an array, which I would like this syntax to eventually generalize to (as Ben mentions): [5 x x] is ok, but it's reasonable to hope that we can do better.

It's probably worth talking a little bit about Fortran. Fortran is the gold standard for "array syntax sugar," 60 years later, with its flexible and concise (multidimensional) array syntax.

integer :: myArray(5)        ! array of five integers
real :: otherArray(3,4)      ! 3x4 array of floats

I don't think that we need to be quite this concise in Swift; for one thing, Swift is not oriented around concise array computation like Fortran is. For another thing, Fortran's syntax doesn't generalize to values, which I think is a pretty desirable future direction¹.

So, some starting list of constraints that seem defensible from my perspective:

  • It should be as minimally source-breaking as possible.
  • Joe has argued, fairly convincingly, that the count should precede the type, mirroring the unsugared declaration. This rules out Type[n] and variants (including directly copying the Fortran syntax).
  • It should visibly relate to existing array or tuple syntax (i.e. involve either () or []). Unlike Tony, I don't think that a tuple association is something that we should avoid, and it might help us get out of the box that we seem to be stuck in. InlineArray is extremely tuple-like (HomogeneousTuple was a working name for the concept for years).

¹ A Fortran programmer might use an implied-DO construct instead, which is more powerful, but maybe a little bit too magical: [(n, i=1, 5)] ! n five times


Just wondering, would of solve all of the mathematical notation concerns?


of isn't any more mathematically precedented than any other symbol, but I also don't think there's a significant mathematical "concern" with x. Once you use + the way Swift does, something that looks like times is mathematically justified.

If we end up choosing a tuple syntax for the type, e.g. something like (5 x Int), then we should probably also change the literal initialization to use a tuple as well because this would be weird:

let x: (5 x Int) = [1, 2, 3, 4, 5]

Well… The two very robust debaters above seem to suggest they have significant mathematical concern for the usage.

The usage of x here would be introducing a new keyword (which I tend to not like), as would of, however, x has baggage that of does not have. Granted, it's minimal.

The concern I personally have with x (ASCII) is that it is a frequent variable name, and it happens to be mentally stored in human brains as an operator despite not being in the operator range in Unicode. of is more descriptive.

(As an aside, + is an operator, so it doesn't require as much consideration for clashing with identifiers, and it has a little less of the baggage of using a new keyword.)

Since you can't currently multiply a number by a Type, "*" is acceptable, I suppose, but people seem to feel like that interferes with their intuition.

I just feel like x is being pushed because it is pretty.

I think this really needs to be discussed: the brackets should NOT be required at all.

The operator should define the type, not the brackets. The [] should define it as an array.

Compare

let x: 5 of Int = [1,2,3,4,5]

with

let y: [5 of Int] = [[1,2,3,4,5], [10,20,30,40,50]]

For this reason alone, [] should NOT be a part of the sugar syntax to define an InlineArray


Just a thought… InlineArray<n, T> doesn’t sound too bad on its own (not much worse than Set, at least). Maybe just keep the syntax as it is, and introduce syntactic sugar only for inline arrays up to a reasonable dimension? (If there is a way to do it for all dimensions without too much effort, even better.) Just to have

InlineArray<n_1, …, n_k, T> // equivalent to InlineArray<n_1, InlineArray<n_2, …, InlineArray<n_k, T>…>>

and, if possible, with some syntactic sugar for initialization of multidimensional arrays.

Or maybe tuple dimensions could be introduced? They are much more natural and solve the syntactic sugar problem well enough.


I think the real sticking point is that Swift uses type names which describe an individual instance of the type, not the set of all possible instances.

For example, String is so named because an instance of that type is a string. It is not named Vocabulary or anything that would suggest the set of all possible strings.

This contrasts with, eg. ℤ², which describes the set of all pairs of integers. If an instance of a type is a pair of integers, then Swift would name that type something like IntPair, because that’s what each instance is.

So if there were to be a Swift type named ℤ², then each instance of that type would represent the set of all pairs of integers. One could imagine such a type appearing in a topology library, where instances could each be ℤ² with, say, a different metric:

let x: ℤ² = .init(metric: .taxicab)
let y: ℤ² = .init(metric: .supNorm)
let z: ℤ² = .init(metric: .euclidean)

• • •

Also, just to clarify, I am not actually arguing for “x”. As I said previously, I think we don’t need to introduce sugar for InlineArray at the present time.


How would you write the sugared equivalent of the following?

let a = InlineArray<5, Int>(repeating: 0)
// I don't think this flies
let a = 5 of Int(repeating: 0)
// Maybe? But awkward that parens are only required sometimes
let a = (5 of Int)(repeating: 0)

Exactly like you suggest:

let a = (5 of Int)(repeating: 0)

or

let a = (5 of Int).init(repeating: 0)

because [5 of Int] is an ARRAY of 5 of Int.

So (5 of Int)(repeating: 0) seems to fly just fine. Why would that be better than [5 of Int](repeating: 0)?

How would you express an array of 5 of Int? You'd have to say [[5 of Int]] -- which to me looks like an Array of an Array of an InlineArray<5, Int>.

This is similar to what you'd have to do with any other operator, like with ... in (1...3).lowerBound or (1...3).contains(where: {})
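
For reference, here is a minimal runnable version of that comparison with today's range operator. It is only a sketch of the existing behavior; the closure body is something I filled in for illustration.

```swift
// The parentheses make the member access apply to the whole range.
let low = (1...3).lowerBound
let hasEven = (1...3).contains(where: { $0 % 2 == 0 })

print(low)     // 1
print(hasEven) // true

// Without parentheses, member access binds tighter than the infix operator:
// let bad = 1...3.lowerBound   // reads as 1...(3.lowerBound); does not compile
```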


Sure, but only because you've chosen to define it that way in your preferred spelling.

I only raise the concern because the class of "types where parentheses are required sometimes" is something that I see frequently frustrate people writing Swift (including myself). The big one that comes to mind is updating P? (where P is a protocol) to use explicit any syntax. You have to write (any P)?, because any P? is misparsed as any (P?). (Of course, we should fix this.)

These decisions can also cause problems for tooling that wants to manipulate such types, because the tooling has to consider more context in order to make a correct transformation. swift-format has some logic to recommend (and autorepair) replacing things like Optional<T> with T?, and in order to do an optimal transformation, we have to consider a lot about what shape T has to determine whether we need to parenthesize it before appending the question mark. It's not just any existential types that have this problem; function types are another, possibly more common one, because the ? appended naïvely would bind to the return type, not the whole function type.
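
A small sketch of the two cases mentioned above, assuming Swift 5.6 or later for the any spelling. Shape and Triangle are hypothetical names made up for the example.

```swift
// Hypothetical protocol and conforming type, for illustration only.
protocol Shape { var sides: Int { get } }
struct Triangle: Shape { let sides = 3 }

// An optional existential needs the parentheses: (any Shape)?
let maybeShape: (any Shape)? = Triangle()

// For function types, the position of ? changes the meaning:
//   () -> Int?    is a non-optional function that returns Int?
//   (() -> Int)?  is an optional function that returns Int
let returnsOptional: () -> Int? = { nil }
let optionalFunction: (() -> Int)? = nil

let r1 = returnsOptional()   // the function exists and returns nil
let r2 = optionalFunction?() // there is no function to call, so the result is nil
```

Both r1 and r2 end up as Int?, but for different reasons, which is exactly the distinction a tool has to preserve when it rewrites Optional<T> as T?.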

Despite my non-preference for parentheses based on their relationship to tuples in my post earlier, if I were given only two choices: 5 of Int and (5 of Int), I would absolutely go with the one that requires the parentheses.


The thing about making it look like a tuple is that people are naturally going to try to write:

var x: (5 of Int, Bool)

Would this be the same as ((5 of Int), Bool)?

Could I add multiple inline arrays in this way?

var y: (5 of Int, Bool, 5 of String)

Or would I have to wrap each one in parentheses?

var y: ((5 of Int), Bool, (5 of String))

If it's the latter, then we need to start explaining that the N of T syntax only applies to homogeneous tuples (why?), where it produces inline arrays.

Maybe that's too much complexity and it would be better not to get tuples involved.


A couple reasons to go with []:

  • It can be subscripted, which you do with []. So that capability of InlineArray would be much more discoverable.
  • The equivalent data structure in other languages uses [], like Rust's [T; N] and C's int foo[N]. IIRC, minimizing surprise for C developers was a big argument for using "array" in the name of this type.

Regardless, it would be almost funny now if after finally choosing InlineArray as the name of the type, the delimiters for the syntax sugar end up being something other than "[]" (which, to me, is the quintessential symbol for arrays in programming), choosing instead to highlight the similarity with tuples by using "()". Parentheses, which are most notably used in mathematical notation for... vectors 😛


And as a Swift programmer and not a mathematician, (5 x Int) looks like shorthand for a five Int tuple.
