Allow regular letters as operator names

' hasn't been used yet :wink:. Theoretically, # followed by non-keywords should also work.

I appreciate that most of the participants in this forum work very hard to keep communication respectful and informative even when doing so is not the easiest option.

One of the most basic decisions that has to be made about a language is whether its parsing will be context-free or context-sensitive. The general trend in language design has been away from context-sensitive parsing (as in C and C++) and toward context-free parsing (Python, Java, C#, Swift, ...).

Context-free languages are not only easier to build tools around (which has been discussed already in this thread), but also enable important quality-of-life features like out-of-order declarations. Swift can parse an operator expression even if the operator being used is declared later in the same file, in a different file of the same module, or in another module entirely. If operator parsing were context-sensitive, then the compiler might need to parse code later in a file in order to understand code earlier in the same file.
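To make the out-of-order point concrete, here is a minimal sketch. The operator `<+>` and the function `combine` are made-up names for illustration; the only thing being shown is that the use site can precede the declaration.

```swift
// `<+>` is a hypothetical operator, used here before it is declared.
func combine(_ a: Int, _ b: Int) -> Int {
    a <+> b
}

// The declaration and implementation appear later in the same file.
infix operator <+>: AdditionPrecedence

func <+> (lhs: Int, rhs: Int) -> Int {
    lhs * 10 + rhs
}
```

The body of `combine` can be parsed without knowing anything about `<+>`; its precedence only matters once the expression is folded later.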

Of course, context-free parsing comes with some trade-offs when a language also wants to support familiar syntax for programmers coming from languages in the C family:

There are desirable pieces of syntax where a context-free parser doesn't have enough information to disambiguate things. Operator precedence in Swift is one example, but there's also a bunch of other mess around things like <> for generics in languages like Java and C#.

Compilers typically fall back to one of two strategies to deal with these constructs while retaining context-freedom:

  1. Heuristic parsing rules based on lookahead can do "good enough" disambiguation for cases like telling whether a < represents a less-than operator or the start of an explicit instantiation of a generic (e.g., this is what the C# compiler does, and while it leads to a small number of "gotcha" cases, most programmers don't even know about the disambiguation rules).

  2. The parser can parse a more flexible super-set of the desired language, and leave the disambiguation to later semantic-checking stages when contextual information is known. This is what Swift does for operator precedence, as @allevato has described, and it is also what Java and C# do for a few key cases where type-vs-expression ambiguities exist (e.g., is (F)(x) a C-style cast or a function call?)

Both of these strategies entail trade-offs, and in particular they both make it harder to develop ad-hoc tools. Option (1) makes parsing more complicated because even an ad hoc parser needs to include the heuristic-based disambiguation logic. Option (2) makes it so that just parsing the language isn't enough to build certain useful tools (as evidenced by the way that swift-format needs to introduce some ad hoc precedence logic).

With an awareness of the trade-offs involved, it should be clear that Swift could support using ordinary identifiers as operator names, just as it could support prefix/postfix operators with more flexible spacing, by using strategy (2) above. In the limit the "parse tree" for an expression would no longer be a tree at all, but a flat list of tokens, and actually forming those tokens into useful expressions would be deferred to semantic analysis.
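As a rough sketch of what strategy (2) looks like in practice, the following folds a flat operand/operator sequence into a tree once precedence is known. Everything here is an assumption for illustration: the `Item` and `Expr` types, the numeric `precedence` table (Swift itself uses precedence groups, not numbers), and treating every operator as left-associative. It is not how the Swift compiler is implemented.

```swift
// Hypothetical flat "parse tree": the parser only records operands and operators.
enum Item {
    case operand(String)
    case op(String)
}

// Hypothetical folded expression tree produced after parsing.
indirect enum Expr: Equatable {
    case name(String)
    case binary(String, Expr, Expr)
}

// Assumed table that a later stage would build from operator declarations.
let precedence: [String: Int] = ["add": 1, "times": 2]

// Folds the flat sequence by precedence climbing.
// Returns nil for malformed input or an unknown operator.
func fold(_ items: [Item]) -> Expr? {
    var index = 0

    func operand() -> Expr? {
        guard index < items.count, case .operand(let name) = items[index] else {
            return nil
        }
        index += 1
        return .name(name)
    }

    func climb(minimum: Int) -> Expr? {
        guard var lhs = operand() else { return nil }
        while index < items.count,
              case .op(let symbol) = items[index],
              let level = precedence[symbol],
              level >= minimum {
            index += 1
            // Left-associative: the right side must bind strictly tighter.
            guard let rhs = climb(minimum: level + 1) else { return nil }
            lhs = .binary(symbol, lhs, rhs)
        }
        return lhs
    }

    let result = climb(minimum: 0)
    return index == items.count ? result : nil
}
```

The parser's job ends at producing `[Item]`; `fold` can only run after the declarations that fill `precedence` have been found, which is exactly the deferral described above.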

The key point is that this is a sliding scale, and every language design needs to settle at a point along it and deal with the trade-offs. I trust that the choice of where Swift should sit on that scale was carefully evaluated, and the deliberations made are visible in the exacting rules around operator parsing that exist today. A call for altering this design decision would need to be made with a deep understanding of the problem space, the history of the decisions already made in Swift (and their rationale), and some truly inspiring examples of what such a change would enable.

11 Likes

Out of interest - would using a special character like ' (thanks @lantua) to delineate a function as a binary infix operator require semantic analysis? I don't believe so – using the same style of example as @allevato you could still parse

    a 'add' b 'times' c

as

Sequence(
  Identifier("a"),
  BinaryOp("add"),
  Identifier("b"),
  BinaryOp("times"),
  Identifier("c")
)

...without needing to know at the parser stage what 'add' and 'times' actually mean. All the parser needs to know is that '...' indicates a binary operator, so it can form the right structure in the parse tree.

Yes, that kind of escaping keeps the class of operator tokens and ordinary identifier tokens distinct, so the same parsing rules Swift already uses would apply.
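A toy tokenizer shows why no semantic knowledge is needed here. This is only a sketch under a strong assumption: tokens are separated by single spaces and a quoted operator name contains no spaces. The `Token` type and `tokenize` function are hypothetical names, not anything from the Swift toolchain.

```swift
// Hypothetical token kinds for the quoted-operator idea.
enum Token: Equatable {
    case identifier(String)
    case binaryOp(String)
}

// Classifies each space-separated word purely by its spelling:
// 'name' becomes a binary operator, anything else an identifier.
func tokenize(_ source: String) -> [Token] {
    source.split(separator: " ").map { word -> Token in
        if word.count > 2, word.hasPrefix("'"), word.hasSuffix("'") {
            return .binaryOp(String(word.dropFirst().dropLast()))
        }
        return .identifier(String(word))
    }
}
```

The classification never looks up what `add` or `times` mean, which mirrors the `Sequence(...)` structure in the question.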

There would still be the burden of justifying why b 'times' c is a big enough win over b.times(c) to justify adding new syntax. My understanding is that the bar for pure "syntax sugar" proposals is much higher than for other classes of feature.

3 Likes

The reason for this is a naive assumption that parsing a token stream gives enough information to enable source code transformations. From my experience building quasi-quotation for C++ with libclang, I can assure you that anything more useful than a linter needs access to the full semantic information of entities, which means all of those tools work better with a syntax-tree representation than with a token stream.
And even then I don't see why people are hostile to the idea of a single preparse lookup stage; it doesn't have to compile everything. Since operator declarations are allowed only at the top level, it only needs to crawl through the code and collect them, which can be done with a regular expression. The AST could also be stored as an XML file or something similar in a Swift package and updated when the source code changes. That would reduce the overhead of parsing and source code transformations even further.
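A minimal sketch of such a collection pass, assuming declarations start at the beginning of a line and use the current `prefix`/`infix`/`postfix operator` spelling. The function name `collectOperatorNames` is hypothetical. A plain regular expression like this would also match text inside comments or multi-line string literals, so treat it as an illustration of the idea rather than a robust tool.

```swift
import Foundation

// Collects the names from lines that look like operator declarations.
// Assumption: the name ends at the first whitespace or colon.
func collectOperatorNames(in source: String) throws -> [String] {
    let pattern = #"^\s*(?:prefix|infix|postfix)\s+operator\s+([^\s:]+)"#
    let regex = try NSRegularExpression(pattern: pattern, options: [.anchorsMatchLines])
    let fullRange = NSRange(source.startIndex..., in: source)
    return regex.matches(in: source, options: [], range: fullRange).compactMap { match in
        // Capture group 1 holds the operator name.
        Range(match.range(at: 1), in: source).map { String(source[$0]) }
    }
}
```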


Still, the solution has already appeared in this thread: there shouldn't be any additional complexity if operators made of letters carry a special marker that makes the distinction clear.
How about ...

infix_operator := ' operator_special_name '
prefix_operator := ' operator_special_name
postfix_operator := operator_special_name '

'not a 'and' 'not b 'and' a 'or' b inverted'
This has only one possible interpretation ...
((((prefix a) infix (prefix b)) infix (a infix b)) postfix)
And this even looks somewhat pretty.


I understand the concerns about misuse and about flooding code with useless synonymous expressions, but the point of this feature is to hide complexity, not to create it.
Function builders are already in the language and do exactly that. I don't see why we shouldn't keep expanding this space. It could potentially bring structural types into the language.
Something like this ...

let kernel = ComputeKernel {
    'local Matrix ([[1,2],[1,2],[1,2]])
    'shared Scalar (0.9)
    'procedure { () -> () in }
}

Alternatively, it could be ...

let kernel = ComputeKernel {
    Local(Matrix([[1,2],[1,2],[1,2]]))
    Shared(Scalar(0.9))
    Procedure { () -> () in }
}

But that hurts my eyes :woozy_face:

1 Like

That's not true, unless you also enforce spacing between infix operators and their operands. Today's Swift does not do so.

* Or forbid spaces within an operator name.

It could be that the existing symbolic operators would receive the same treatment, but if the tokenizer encounters something like '_ or _', it resolves it as a special case.

prefix operator 'not // must be followed by a space
prefix operator *-- // must be adjacent to its operand

'not true
*--true

Operators are capable of so much more than extensions. For example, it's currently possible to create generic operators on unrestricted parameters, but you cannot extend 'Any' in the same fashion. It came up in a few discussions in the past: Where to start if you want to extend `Any`

infix operator <*>: DefaultPrecedence
func <*> <A, B>(left: A, right: B) -> (A, B) {
    (left, right)
}

You can create similar generic operators that are otherwise impossible to express using extensions.
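For completeness, a small usage sketch. It repeats the `<*>` definition from this post so that it compiles on its own; the values paired up are arbitrary.

```swift
infix operator <*>: DefaultPrecedence

// Works for any two types, with no constraints on A or B.
func <*> <A, B>(left: A, right: B) -> (A, B) {
    (left, right)
}

let pair = 1 <*> "one"        // (Int, String)
let mixed = [1, 2] <*> 0.5    // ([Int], Double)

// The extension-based equivalent is rejected by the compiler,
// because `Any` is a non-nominal type:
// extension Any { func paired<B>(with other: B) -> (Self, B) { (self, other) } }
```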
(My motivation for introducing named operators is - apart from massive readability improvements - that it would unlock new, unexplored areas of writing DSLs.)

2 Likes