Should Numeric not refine ExpressibleByIntegerLiteral?

Problem

The Numeric protocol today refines ExpressibleByIntegerLiteral. This makes sense for scalars, but does not work well with high-dimensional data structures such as vectors and tensors.

Let's think of a scenario where there's a VectorNumeric protocol that refines Numeric. A vector numeric protocol needs a couple of extra operators, particularly arithmetic operators that take a scalar on one side:

protocol VectorNumeric : Numeric {
  associatedtype ScalarElement : Numeric
  init(_ scalar: ScalarElement)
  static func + (lhs: Self, rhs: ScalarElement) -> Self
  static func + (lhs: ScalarElement, rhs: Self) -> Self
  static func - (lhs: Self, rhs: ScalarElement) -> Self
  static func - (lhs: ScalarElement, rhs: Self) -> Self
  static func / (lhs: Self, rhs: ScalarElement) -> Self
  static func / (lhs: ScalarElement, rhs: Self) -> Self
}

Here's a conforming type:

extension Vector : VectorNumeric {
  static func + (lhs: Vector, rhs: Vector) -> Vector {
    ...
  }
  static func + (lhs: Vector, rhs: ScalarElement) -> Vector {
    ...
  }
  ...
  init(integerLiteral: ScalarElement) {
    ...
  }
}

OK, now let's do some arithmetic:

let x = Vector<Int>(...)
x + 1

This fails because + is ambiguous: the literal 1 can be converted to either ScalarElement or Vector (via Numeric's ExpressibleByIntegerLiteral requirement), so the call matches both + (_: Self, _: ScalarElement) and + (_: Self, _: Self).

  static func + (lhs: Self, rhs: ScalarElement) -> Self
  static func + (lhs: Self, rhs: Self) -> Self

Possible solutions

  1. Move ExpressibleByIntegerLiteral refinement from Numeric to BinaryInteger, just like how BinaryFloatingPoint refines ExpressibleByFloatLiteral. Numeric will no longer require conforming types to be convertible from integer literals.

  2. Remove the overloaded self + scalar arithmetic operators, leaving only self + self. This resolves the ambiguity, but it makes vector libraries harder to use and departs from standard mathematical notation.
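To illustrate how solution 1 dissolves the ambiguity, here is a minimal sketch (the protocol and type names are hypothetical, not actual stdlib declarations): once the vector type no longer conforms to ExpressibleByIntegerLiteral, a scalar literal on one side of + has only one viable meaning.

```swift
// Hypothetical sketch, not the real stdlib: a Numeric-like protocol
// with the ExpressibleByIntegerLiteral requirement removed.
protocol NumericSansLiterals {
  static func + (lhs: Self, rhs: Self) -> Self
  static func - (lhs: Self, rhs: Self) -> Self
  static func * (lhs: Self, rhs: Self) -> Self
}

// A toy vector type. It conforms without having to accept integer
// literals, so `v + 1` below has exactly one viable overload.
struct Vec2: NumericSansLiterals {
  var x, y: Int
  static func + (lhs: Vec2, rhs: Vec2) -> Vec2 { Vec2(x: lhs.x + rhs.x, y: lhs.y + rhs.y) }
  static func - (lhs: Vec2, rhs: Vec2) -> Vec2 { Vec2(x: lhs.x - rhs.x, y: lhs.y - rhs.y) }
  static func * (lhs: Vec2, rhs: Vec2) -> Vec2 { Vec2(x: lhs.x * rhs.x, y: lhs.y * rhs.y) }
  // Scalar-on-the-right addition: no longer ambiguous, because the
  // literal 1 cannot be converted to Vec2 itself.
  static func + (lhs: Vec2, rhs: Int) -> Vec2 { Vec2(x: lhs.x + rhs, y: lhs.y + rhs) }
}

let v = Vec2(x: 1, y: 2) + 1   // resolves to (Vec2, Int) -> Vec2
```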

What does everyone think?


cc @moiseev, @scanon

IIRC Numeric refines ExpressibleByIntegerLiteral mainly for 0, and possibly also for 1. We could have probably gotten away with saying init() produces 0, and maybe doing nothing for 1, but…at this point that would be source-breaking for anyone who's extended Numeric directly. I don't think we can change this.

On the other hand, does it actually make sense for vectors to be Numeric anyway? There's not a natural * for vectors.


Mathematica uses different symbols for the two multiplications, and I quite like the clarity that brings. Perhaps you could use .* for a dot product and * for a scalar product, and similarly for the other operators.

Arithmetic operators are element-wise. * would be element-wise multiplication.

In the code example, * means element-wise multiplication. The * that takes a scalar on one side is also element-wise: it multiplies every element of the vector by the scalar.

* as element-wise multiplication for vectors is fairly standardized, as NumPy, TensorFlow, and PyTorch all use this operator. In any case, whether * should be element-wise multiplication is orthogonal to this post. Other operators like + and - are still problematic due to the ambiguity caused by literal conversion.

Numeric doesn't have an init(). It's understandable that BinaryInteger would use 0 because it should be ExpressibleByIntegerLiteral. In the proposed solution 1, BinaryInteger can still refine ExpressibleByIntegerLiteral.

From my earlier discussion with @scanon, it makes sense to conform Vector or Tensor to Numeric since there's nothing scalar-specific in that protocol. But now it's hitting a blocker.

I agree and understand that source breaking is certainly bad. IMO this issue is important for future vector APIs in Swift including simd-related types and Tensor in Swift for TensorFlow. Given that it would be less principled in my opinion to define a separate VectorNumeric protocol that repeats all Numeric requirements except the ExpressibleBy conformance, a change may be necessary.


It's really not just for 0 and 1. Numeric corresponds roughly to the mathematical notion of a "ring [with unity]" (except for the .magnitude property, which we might consider removing). There's a canonical homomorphism from the integers to every ring with unity (in the language of category theory, Z is the "initial object" in the category of rings with unity).

For any type conforming to Numeric, there's an unambiguous way to interpret any integer literal, uniquely determined by that homomorphism.
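For concreteness, that homomorphism can be written out using only operations Numeric already requires. This illustrative helper (not part of any proposal) maps an Int n to n · 1 by repeated addition:

```swift
// Illustrative only: the canonical map from the integers into any
// Numeric type, built from the literals 0 and 1 plus + and -.
func canonicallyEmbedded<T: Numeric>(_ n: Int) -> T {
  var result: T = 0
  let one: T = 1
  for _ in 0..<abs(n) { result = result + one }
  // Negative integers map to the additive inverse of |n| * 1.
  return n < 0 ? (0 - result) : result
}
```

Every integer literal can be interpreted this way in any conforming type, which is why the refinement is well-defined for all rings with unity, not just for scalars.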

Part of the issue that you're running up against here is that vector spaces are not naturally rings (though you can endow them with the element-wise product and turn them into rings, which TF has done), but you don't want that to be the product for all vector-space objects--consider matrices or quaternions, which have their own notions of multiplication and identity.

It makes sense for another protocol to exist, but I think it's probably a weakening of the existing Numeric that only requires the arithmetic operators and zero, and doesn't have magnitude or integer literal conformance. Numeric would then refine that protocol, and Vector or whatever would also refine it, adding an associated scalar type and multiplication and division by scalars.
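As a rough sketch of what that split might look like (all names here are placeholders, not a concrete proposal):

```swift
// Placeholder names throughout; a sketch of weakening Numeric to
// "arithmetic operators plus zero" only.
protocol WeakArithmetic {
  static var zero: Self { get }
  static func + (lhs: Self, rhs: Self) -> Self
  static func - (lhs: Self, rhs: Self) -> Self
  static func * (lhs: Self, rhs: Self) -> Self
}

// Numeric would refine it, adding back magnitude and literals.
protocol NumericLike: WeakArithmetic, ExpressibleByIntegerLiteral {
  associatedtype Magnitude: Comparable
  var magnitude: Magnitude { get }
}

// Vector types would refine the weak protocol directly, adding a
// scalar type and scalar multiplication/division, and would never
// pick up the integer-literal requirement.
protocol VectorLike: WeakArithmetic {
  associatedtype ScalarElement
  static func * (lhs: ScalarElement, rhs: Self) -> Self
  static func / (lhs: Self, rhs: ScalarElement) -> Self
}

// Standard-library integers already satisfy the weak requirements.
extension Int: WeakArithmetic {}

// Generic code can then target the weak protocol:
func squared<T: WeakArithmetic>(_ x: T) -> T { x * x }
```

Because existing Numeric conformances would satisfy the weaker protocol automatically, this direction avoids the source break that dropping ExpressibleByIntegerLiteral from Numeric itself would cause.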


OK. You have a typo in your original post:

That is what misled me. Did you mean scalar-self or self-scalar? Though if I had read your post more carefully I would have realised; sorry.

I actually meant func + (_: Self, _: ScalarElement) and func + (_: Self, _: Self). I'll clarify that in the original post. Thanks!

Then in your protocol VectorNumeric you mean:

static func + (lhs: Self, rhs: ScalarElement) -> Self
static func + (lhs: Self, rhs: Self) -> Self // Changed from scalar self to self self.

Yes?

Numeric already requires the (Self, Self) -> Self version.

Thanks.

@scanon has probably come up with the best solution, splitting Numeric up. That will be backwards compatible and more flexible in the future.

Addressing the slightly-orthogonal point, since I've given it a bunch of thought lately:
* should be element-wise multiplication for (computational) vectors. * should also be the natural ring multiplication for matrices and quaternions and other algebras. The real question, then, is how to spell the element-wise multiplication and division for those things, and increasingly, I think that the answer is "get the .vector[1] view of the data and use the vector operator."

  1. placeholder spelling, to be bikeshedded. But this is just a "forgetful" operation that throws out the type's multiplicative structure, projecting to the vector-space endowed with the elementwise product.
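A minimal sketch of that idea (the type names and the .vector spelling are placeholders): a matrix type keeps * as its ring multiplication, and a forgetful vector view re-exposes the same storage with elementwise semantics.

```swift
// Hypothetical sketch of the "forgetful vector view" idea.
struct Vector4 {
  var elements: [Double]
  // Elementwise product: the natural * for computational vectors.
  static func * (lhs: Vector4, rhs: Vector4) -> Vector4 {
    Vector4(elements: zip(lhs.elements, rhs.elements).map(*))
  }
}

struct Matrix2x2 {
  var storage: [Double]   // row-major: [a, b, c, d]
  // Ring multiplication: the ordinary 2x2 matrix product.
  static func * (lhs: Matrix2x2, rhs: Matrix2x2) -> Matrix2x2 {
    let a = lhs.storage, b = rhs.storage
    return Matrix2x2(storage: [
      a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
      a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3],
    ])
  }
  // The forgetful projection: same data, elementwise operations.
  var vector: Vector4 { Vector4(elements: storage) }
}
```

With this shape, m1 * m2 stays a matrix product, while m1.vector * m2.vector is the elementwise (Hadamard) product of the same data.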

I meant arithmetic operators that take a scalar on one side. "Self" refers to the vector type. These methods are ambiguous with the (Self, Self) -> Self method when one of the operands is a scalar literal.

  static func + (lhs: Self, rhs: ScalarElement) -> Self
  static func + (lhs: ScalarElement, rhs: Self) -> Self

I like that!

On the TensorFlow side, I'm inclined to prefer * for Tensor's element-wise multiplication, since that's the widely accepted operator in machine learning libraries. True tensor multiplication could use a tensordot(_:) method and a dedicated operator (the former is consistent with tf.tensordot). This feels a bit off-topic for this post though.

This is one area where looking at Julia might be helpful, where the correspondences between mathematical notions and protocols in the language relating to numeric/vector/matrix types are a little more fleshed out (although those protocols are mostly not enforced in the type system).

Essentially, Julia has a protocol for "field-like types/numbers" and another one for "module-like types/vectors" that builds on top of it.
The number protocol (implemented by subtypes of Number among other things) includes the following mostly-mandatory methods (with approximate Swift equivalents):

  • +(x::T, y::T) where T (equivalent to static func + (lhs: Self, rhs: Self) -> Self)
  • -(x::T, y::T) where T (equivalent to static func - (lhs: Self, rhs: Self) -> Self)
  • *(x::T, y::T) where T (equivalent to static func * (lhs: Self, rhs: Self) -> Self)
  • /(x::T, y::T) where T (equivalent to static func / (lhs: Self, rhs: Self) -> Self)
  • -(x::T) where T (equivalent to static prefix func - (_ operand: Self) -> Self)
  • inv(x::T) where T (equivalent to a hypothetical static func reciprocal(of: Self) -> Self)
  • one(::Type{T}) where T (the multiplicative identity; also the result of converting 1 to this type—Julia doesn't have Swift's literal overloading system yet)
  • zero(::Type{T}) where T (the multiplicative zero and additive identity; also the result of converting 0 to this type)
  • oneunit(::Type{T}) where T (the additive unit, which is different from the multiplicative identity for types that represent unitful quantities)

as well as comparison operators. There is also a promotion mechanism which requires methods like promote_rule(::Type{T}, ::Type{F}) where {T, F<:AbstractFloat} = F in order to define the behavior of arithmetic operations between T and other types.

Where things get interesting is the protocol for vector/module-like types. Many types can behave both as "scalars"/elements of a ring and as "vectors"/elements of a module, so there's special syntax for lifting an operation into the vector space's underlying field (or module's underlying ring):

  • The + operation on vector-like types is unambiguously elementwise, as that's the meaning of addition in the context of mathematical vector spaces or modules.
  • *(x::S, y::T) where {S<:Number, T<:AbstractVector}, where S is the scalar/element type associated with the vector-like type T, is also a natural operation in vector spaces, giving basically the broadcasted elementwise product.
  • The * operation on vector-like types that are not rings is undefined; on matrices, quaternions, or similar types, it means their natural ring multiplication.
  • If x and y are both instances of vector-like types, whose element type implements the number protocol, then x .* y performs elementwise multiplication by looping over the element type's * method.
  • x .+ y also performs elementwise addition; in general, these "dotted" operators perform broadcasted elementwise math for all combinations of shaped collection types and scalars.

Swift/the TF project has already made the perfectly reasonable choice to follow NumPy, TensorFlow, and PyTorch and make plain mathematical operators on vector-like types act elementwise. This means that "ring multiplication" needs a special operator (which is • for now); if we ever want things like polynomials to work generically over any ring (scalars, matrices, quaternions...) then we'd also need • to mean * on scalars, and we'd write those polynomials like 2 • x • x + 3 • y. (This is why Julia went the other way, and forced elementwise operations into nonstandard syntax, so that 2x^2 + 3y just works for matrix/quaternion x and y; but of course familiarity for TensorFlow programmers is a strong argument for the other choice.)
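To make the "polynomials over any ring" point concrete, here is a sketch (the RingElement protocol and the use of • are illustrative assumptions, not existing API). The constants are passed in explicitly, since without literal conversion a ring type need not know how to interpret 2 or 3:

```swift
// Illustrative only: a minimal ring protocol with • as its product.
infix operator •: MultiplicationPrecedence

protocol RingElement {
  static func + (lhs: Self, rhs: Self) -> Self
  static func • (lhs: Self, rhs: Self) -> Self
}

// For scalars, • just forwards to ordinary multiplication.
extension Int: RingElement {
  static func • (lhs: Int, rhs: Int) -> Int { lhs * rhs }
}

// 2x^2 + 3y, written against ring operations only, so the same code
// could in principle work for scalars, matrices, or quaternions.
func polynomial<T: RingElement>(two: T, three: T, x: T, y: T) -> T {
  two • x • x + three • y
}
```

A matrix or quaternion type would conform by making • its natural (non-elementwise) multiplication, leaving * free for the elementwise meaning.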


Yup, .[op] would be the other reasonable option. TF's current choice makes a lot of sense for TF, but it probably doesn't make sense for protocols that end up in the stdlib, because people who are working with 4x4 matrices or quaternions don't want to write • whenever they need to multiply.

My current thinking is motivated by trying to satisfy both camps (ML and what I'll call "geometry") if we can. I think that having a forgetful vector-view can actually work pretty cleanly, because you don't often flip back and forth between interpreting objects as abstract vectors and interpreting them as members of an algebra. It's much more common to use one interpretation for long stretches of code.

Is Julia's Number protocol really field-ish, or is it really a ring? e.g. are integers Numbers? If integers are Numbers, what is the inverse of 2?


What would you like to call this intermediate protocol?

Excellent question. Mathematically, it's a "rng" (the way-too-cute term for a "ring without identity"). This is, obviously, not a very good name for the protocol.

My first thought is to dust off the original name for Numeric, which was Arithmetic. It does a pretty good job of capturing "this thing has the familiar arithmetic operations, but isn't necessarily something you'd think of as a 'number'."


Arithmetic seems to be the natural choice. It's a little weird in that it only defines operations, without saying anything about what instances of conforming types represent.


Do you mean Arithmetic is weird because the name is too semantically general? Or because it doesn't define initializers?

I do like the name Arithmetic.