String vs Collection, ambiguity, slice vs split

A couple of days back, I ran into a pretty weird situation when trying to perform a simple processing step on a very simple String literal. Consider this code:

"abc def ghi jklmno"
.split(separator: " ")
.filter({ (substr) -> Bool in
    return substr.count == 3 })
.joined(separator: ", ")

Long story short, Swift 5 fails to compile the code, reporting the use of .count as ambiguous.

One might ask: oh dear, shouldn't such a simple chain of operations be obvious, unambiguous and easy to perform?

Well, there's been a lot to discuss in my bug report ([SR-10065] `Substring.count` is ambiguous inside a `[Substring].filter` block · Issue #52467 · apple/swift · GitHub).

TL;DR: Conflicting implementations of the split method, coming from both Collection and String, set off a cascade of ambiguity (and sadness in my eyes) over such a stupid-simple piece of code.

The workaround is pretty easy, yet it isn't very nice: it's non-intuitive, and it's hard to explain why we need such tricks (like an explicit type annotation somewhere in the chain) just because the language allows so many vague situations to happen.
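One possible form of that workaround, as a minimal sketch: pin the intermediate result to [Substring] so that only one split overload can apply. The names parts and result are just illustrative.

```swift
// Annotate the intermediate value so the compiler has a single
// candidate for split: the one returning [Substring].
let parts: [Substring] = "abc def ghi jklmno".split(separator: " ")

// With the element type fixed, .count is no longer ambiguous.
let result = parts
    .filter { $0.count == 3 }
    .joined(separator: ", ")

print(result) // abc, def, ghi
```

It works, but the extra annotation is exactly the kind of noise I'd like to avoid.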

My proposals in this particular situation:

  • moving all Collection.split functionality to a newly introduced slice method returning a [WhateverSlice<…>] or a similar type
  • reserving split for results collecting the same or inherited type arrayed (String[Substring] in this case, Int[Int] etc.)
  • allowing String and its descendants to use split with a separator of String type, not only Character, 'cause we just need a simple way to break a string using another string, right? As easily as we can invoke joined on a collection of Strings with a String separator.
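For reference on the last point, a sketch of the closest thing I know of today: Foundation's components(separatedBy:) does accept a String separator, but it returns [String] rather than [Substring] and requires importing Foundation, so it isn't quite the split counterpart proposed above.

```swift
import Foundation

let text = "abc, def, ghi"

// Break a string using another string (Foundation API, returns [String]).
let pieces = text.components(separatedBy: ", ")

// joined takes a String separator, which is the symmetry the proposal asks for.
let rejoined = pieces.joined(separator: ", ")

print(pieces)   // ["abc", "def", "ghi"]
print(rejoined) // abc, def, ghi
```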

Further stuff to consider: disallow duplicate method signatures which differ only in return type. They only lead to ambiguity, unpredictable behaviour, a potential side channel for security threats via class extension poisoning, and a need to add type annotations to otherwise clear and working code, making it less friendly to programmers.

Even if we wanted to do these, we couldn't; they would break both source and binary compatibility.

(The other one is just an additional overload.)

Well, it's still possible as 5.0 isn't out yet. Making huge String changes while reducing its ease of use right before freezing the ABI may make this unfixable forever (or at least not easily fixable). Refactoring something as fundamental as a basic type without leaving room for a follow-up fix release seems a bit unwise in this context. I've honestly tried to be reasonable and constructive; maybe it's worth thinking about the issue, its impact and ways to improve the situation. 🙂

It really isn't possible to change 5.0 at this point. A release for a project as large as Swift is like a freight train—once the momentum builds up, you can only slam on the brakes in an emergency, and it will still take miles to stop.

For comparison, three weeks ago we barely got SE-0241 in after severely downscoping it to the bare minimum necessary to prevent shipping code using a fairly popular API from silently changing behavior when recompiled. The changes you suggest could not have been made then, let alone now.


Yeah, I get it 🙂 but it's kind of a pity that stuff like this couldn't be caught in time; I'm even more surprised that nobody ran into this issue and thought about it before. Swift 5 has been publicly available in Xcode 10.2 for only about a month or so, and since most developers wait for it to ship with Xcode before testing Swift 5 in practice (by far not everyone is into compiling it from source or testing some 3rd-party bundle), that leaves too little time to come forward with issues, smaller or bigger.

What's the technical opinion on this? I expected the “ABI freeze” argument, but I guess the discussion could bring some ideas and solutions as well. Is there any chance of improving this specific case in the future?

I can't help feeling that the ABI freeze is premature, and that it will hurt more as similar (possibly even bigger) errata surface once 5.0 is out, as issues that are nearly or entirely impossible to ever fix.

Overloading by return type is an intentional feature of the language, not a bug. There are a variety of ways you can disambiguate, including using 'as'.
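As a rough sketch of what that looks like in practice: the two parse functions below are hypothetical, made up only to show overloads that differ in return type, while the last lines apply the same 'as' coercion to the split call from the original post.

```swift
// Hypothetical overloads that differ only in their return type.
func parse(_ text: String) -> Int { Int(text) ?? 0 }
func parse(_ text: String) -> Double { Double(text) ?? 0 }

// Either 'as' or a type annotation selects the overload.
let whole = parse("42") as Int
let real: Double = parse("42")

// The same coercion picks the [Substring] flavour of split.
let parts = "abc def ghi jklmno".split(separator: " ") as [Substring]
let result = parts.filter { $0.count == 3 }.joined(separator: ", ")

print(whole, real, result)
```

Note that 'as' here does not convert anything at run time; it only tells the type checker which overload to choose.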

My impression is that an ABI is like a novel—you're never really finished, you just have to stop at some point and live with what you've got. If you wait for it to be perfect, you’ll never be done.

Honestly, ABI is the smaller obstacle here; we could leave deprecated versions in place. Source compatibility is the bigger issue. We don't rename things without really good reasons, and a slightly weird set of overloads isn't a really good reason.

If you want to solve this problem, I would look for a solution which will break as little existing code as possible, such as adding new overloads.


As far as I can tell from a quick visual check of the types of String.split in Swift 4.2 in Xcode 10.1, all variants used to return a [Substring] result, so the changes in Collection's implementation of split probably break backward compatibility for existing code with no explicit type annotations.

Worse, this isn't even enforced (that is, detectable by recompiling): as the original bug report documents, the ambiguity error doesn't surface until certain conditions are met. A plain String.split call returns [Substring], a String.split.filter call silently returns [ArraySlice<…>], and only String.split.filter.joined finally triggers the ambiguity error. I'd expect both earlier sequences to have a predictable result at every step, and therefore a sane, predictable return type as well.

As it stands, the choice is probably between A) String splitting stays BC-broken and ambiguous and has to be fixed with explicit typing, or B) Collection splitting is changed again (BC-breaking as well?) and moved to another, sanely named function (slice, returning XYZSlice instances), while String.split is left untouched in terms of its expected result type.

Something will be BC-broken anyway. 🤔

As for the variety of ways to enforce a type: yeah, it's possible, but it can be really insane to toggle between completely different types. Re-typing numbers like let i = 5 as Float to ensure the desired type, or using let s = "something" as NSString to alter the wrapped/backing type, is quite useful in some use cases, but this is more like allowing let a = 5 as Banana or let a = 5 as [MorningBreakfast], which looks very confusing at first sight.
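To make the comparison concrete, here is a small sketch. Banana is the made-up type from the sentence above; the ExpressibleByIntegerLiteral conformance I give it is purely my assumption, added to show that nothing in the syntax separates the "useful" coercion from the "confusing" one.

```swift
// Re-typing a literal: the familiar, useful case.
let i = 5 as Float

// Hypothetical type that opts in to integer literals.
struct Banana: ExpressibleByIntegerLiteral {
    let count: Int
    init(integerLiteral value: Int) { count = value }
}

// Syntactically identical to the line above, yet a completely different type.
let a = 5 as Banana

print(type(of: i), type(of: a), a.count)
```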