`NonEmpty` collections support

@lorentey thank you for the feedback. Just out of curiosity, if you had to solve this problem now in the best way possible, how would you approach and model it? It's still valuable information when folks from the Swift team share their ideas.

I think it's quite a stretch to read that as permission to ignore the documentation, especially when it is so explicit. As with all libraries - using undocumented features, or using features in a way which the documentation explicitly asks you not to, leads to fragile code and is bad practice.

What problem would that be?

If there is something wrong with the NonEmpty implementation, I think it’d be best to bring it to the attention of the NonEmpty project owners! This forum is focused on the Swift Collections project.

That is a bit picky. I can move this thread elsewhere if you feel uncomfortable answering my question while it's in this sub-category. ;) That's not an issue. (Would Discussion fit better?)

I mean literally solving general "non-emptiness" for collections in Swift, not via a third-party library. How would you approach this task? I'm asking because you have a lot of knowledge about collections in Swift and, as you mentioned in your previous post, you think NonEmpty shouldn't be a data structure. So what should it be then: a protocol, a potential type attribute? I'm fine with straw-man and bikeshedding ideas. I'm not intentionally trying to waste your or anyone else's time, I just want to move this topic forward and gather more feedback on it.

At this time I’m not personally interested in researching alternative ways to solve this problem.

The reason for this is that I see no indication that this is a problem that Swift programmers are particularly struggling with.

I also don’t think I’d be able to suggest anything that’s meaningfully better than what NonEmpty already is.

As I see it, taking NonEmpty to the next level would require a huge amount of grueling legwork, including considerable source compatibility churn for Swift users. It was worth investing this effort for Optional (esp since it came built into the first version of the language), but I don’t currently see how it would be worth doing the same for NonEmpty.

(This also means that I don’t think it would make sense to add it to the stdlib; without the platforms embracing it, all that would achieve is to unnecessarily freeze its design.)

Apologies if this is a disappointment.

(Disclaimer: To be very explicit, I don’t intend this to be a categorical value judgment; it also isn’t to be taken as any sort of official rejection of any and all efforts to explain more things to the type system. I am just personally indifferent to this particular case. I reserve the right to change my mind in the future.)

Thank you for sharing your personal take on this topic. :+1:

I'm going to move this thread into a potentially better-fitting category, just for the sake of historical documentation of this particular topic, as it does not seem to fit the Collections package.

Oh, to clear this up: I just meant that NonEmpty isn’t a data structure in the regular meaning of this term — i.e. it isn’t like an array, a hash table, a trie etc.

I wasn’t trying to hint that I have a secret idea for a clever way of doing this without defining a type for it. (In fact, as a library engineer, I tend to err on the side of trying to solve everything by defining new types. :wink:)

+1 for this idea.

I want to provide some real-world examples for people who are asking about practical usage.
One such example is flights and flight routes. Imagine we want to buy a ticket from Moscow to New York, then from New York to Paris, then from Paris to Singapore.

A typical task is to display the departure and arrival cities. If we have NonEmptyArray, it is pretty easy:
"\(travelRoute.first.departureCityName) – \(travelRoute.last.arrivalCityName)"

Another task is to validate that the document expiration date is later than the arrival in the last city:
document.expirationDate > travelRoute.last.arrivalDate

Another task is calculating the overall trip duration:
let travelDuration = travelRoute.last.arrivalDate.timeIntervalSince(travelRoute.first.departureDate)

If we use a standard Array, then we have to deal with the optionals in .first?.departureDate and .last?.arrivalDate.
Of course, there are workarounds, but all of them are inconvenient.

Keep in mind that the travel domain has tons of tasks where non-optional first and last elements make the code cleaner, clearer and shorter.
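
To make the flight examples concrete, here is a rough sketch of how non-optional `first` and `last` could look. The `NonEmptyArray` below is a hypothetical minimal wrapper written only for illustration (it is not the NonEmpty library's implementation), and the `Flight` fields, the `validating:` label and the dates are assumptions.

```swift
import Foundation

// Hypothetical minimal wrapper, not the NonEmpty library's API.
struct NonEmptyArray<Element> {
    // Storing the head separately makes the empty state unrepresentable.
    let head: Element
    let tail: [Element]

    init(_ head: Element, _ tail: Element...) {
        self.head = head
        self.tail = tail
    }

    // The only place where emptiness has to be checked at run time.
    init?(validating elements: [Element]) {
        guard let head = elements.first else { return nil }
        self.head = head
        self.tail = Array(elements.dropFirst())
    }

    var first: Element { head }
    var last: Element { tail.last ?? head }
}

// Assumed shape of a flight, following the property names used above.
struct Flight {
    let departureCityName: String
    let arrivalCityName: String
    let departureDate: Date
    let arrivalDate: Date
}

let start = Date(timeIntervalSince1970: 0)
let travelRoute = NonEmptyArray(
    Flight(departureCityName: "Moscow", arrivalCityName: "New York",
           departureDate: start, arrivalDate: start.addingTimeInterval(36_000)),
    Flight(departureCityName: "New York", arrivalCityName: "Paris",
           departureDate: start.addingTimeInterval(86_400),
           arrivalDate: start.addingTimeInterval(115_200))
)

// No optional chaining is needed for either end of the route.
let title = "\(travelRoute.first.departureCityName) – \(travelRoute.last.arrivalCityName)"
let travelDuration = travelRoute.last.arrivalDate
    .timeIntervalSince(travelRoute.first.departureDate)
```

The emptiness check does not disappear; it moves into the failable `init?(validating:)`, which is where data coming from a server or database would enter.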

Another example is calculating an average value:
Array().average() - if the array is empty, what is the result? Zero is almost never suitable.

Again, NonEmptyArray solves this problem.
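
A small sketch of the average case, assuming a hypothetical head-plus-tail `NonEmptyArray` and an `average()` helper (neither is part of the standard library; both are written here for illustration only):

```swift
// Hypothetical wrapper for illustration; names are assumptions.
struct NonEmptyArray<Element> {
    let head: Element
    let tail: [Element]
    var count: Int { tail.count + 1 }
}

extension NonEmptyArray where Element == Double {
    // count is at least 1, so the division is always defined.
    func average() -> Double {
        tail.reduce(head, +) / Double(count)
    }
}

extension Array where Element == Double {
    // With a plain Array the empty case has to surface somehow, here as nil.
    func average() -> Double? {
        isEmpty ? nil : reduce(0, +) / Double(count)
    }
}

let scores = NonEmptyArray(head: 2.0, tail: [4.0, 6.0])
let mean = scores.average()            // Double
let maybeMean = [Double]().average()   // Double?
```

The difference is only in the return type: the non-empty version can promise a `Double`, while the `Array` version has to pick between an optional, a sentinel value or a trap.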

I think this demonstrates more why a NonEmpty collection is of limited use, because what you actually want for a TravelRoute collection is at least two elements, not just one. It's still not clear to me why it's better to try to enforce this in the type system rather than just in an initialiser somewhere. Either way, you're going to have to check that the collection you're trying to wrap in a NonEmpty wrapper or pass to some API isn't empty.
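
For comparison, enforcing the invariant "just in an initialiser" might look roughly like this. `TravelRoute` and its members are hypothetical names chosen for the sketch, and the minimum of one flight is an assumption that could just as well be two.

```swift
struct Flight {
    let departureCityName: String
    let arrivalCityName: String
}

// Hypothetical domain type: the invariant is checked once, in the initializer.
struct TravelRoute {
    private let flights: [Flight]

    init?(flights: [Flight]) {
        // Change this condition to `flights.count >= 2` for a stricter rule.
        guard !flights.isEmpty else { return nil }
        self.flights = flights
    }

    // Safe to index: the initializer rejected the empty case
    // and the storage is immutable afterwards.
    var first: Flight { flights[0] }
    var last: Flight { flights[flights.count - 1] }
}

let emptyRoute = TravelRoute(flights: [])
let route = TravelRoute(flights: [
    Flight(departureCityName: "Moscow", arrivalCityName: "New York"),
    Flight(departureCityName: "New York", arrivalCityName: "Paris"),
])
```

The guarantee here is tied to one domain type, whereas a generic non-empty wrapper would carry it across any collection of any element.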

My two cents on this. Limited use does not imply that there is little interest in the concept of non-emptiness. Its use is limited because there is no convenient solution for the whole range of collections yet. We don't know how this feature would have been adopted if we already had it built in.

Did people complain in languages without an Optional type that they needed it?

General non-emptiness only needs to guarantee that at least one element is present in the collection. Anything else would be a different constraint that would also imply that the collection is not empty.

If we view this problem from another perspective, then you're right. If collections had a statically known range boundary, then we would have even more guarantees for the min and max boundaries of collections.

Something like this: Array<Int, 2 ..< 10>

That would also partly touch the "fixed-size collections" problem.

Array<Int> == Array<Int, 0 ..< .max> // but conceptually `max` should be `infinity`
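
The range syntax above is hypothetical. As a rough approximation of the idea, the same bounds can be checked at run time; the `BoundedArray` type and its members below are made-up names for this sketch, and the check is dynamic rather than part of the type.

```swift
// Hypothetical type; the bounds are checked at run time, not by the compiler.
struct BoundedArray<Element> {
    let allowedCounts: Range<Int>
    private(set) var elements: [Element]

    init?(_ elements: [Element], allowedCounts: Range<Int>) {
        guard allowedCounts.contains(elements.count) else { return nil }
        self.allowedCounts = allowedCounts
        self.elements = elements
    }

    // Returns false and leaves the value unchanged
    // if appending would exceed the upper bound.
    @discardableResult
    mutating func append(_ element: Element) -> Bool {
        guard allowedCounts.contains(elements.count + 1) else { return false }
        elements.append(element)
        return true
    }
}

var pair = BoundedArray([1, 2], allowedCounts: 2 ..< 4)
let tooShort = BoundedArray([1], allowedCounts: 2 ..< 4)
let appendedThird = pair?.append(3)    // count becomes 3, still inside 2 ..< 4
let appendedFourth = pair?.append(4)   // a count of 4 is outside 2 ..< 4
```

Non-emptiness is then just the special case where the lower bound is 1.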

Yes, often.

I don't like the analogy to Optional that everyone keeps making because the scale is misleading. For an Optional wrapper either the entire API of the type is available or none of it is. For a NonEmpty collection, you're just changing the optionality of the return type of a small number of methods. This is why I think NonZeroNumber is a better analogy. Okay, it helps you keep track of whether division is defined or not, but most operations on the number are now either disallowed or have to return a number that is possibly zero. The cost-benefit ratio is completely different.
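
A sketch of the NonZeroNumber analogy, using a hypothetical `NonZeroInt` wrapper and operator overloads written only to illustrate the point:

```swift
// Hypothetical wrapper used only to illustrate the analogy.
struct NonZeroInt {
    let value: Int

    init?(_ value: Int) {
        guard value != 0 else { return nil }
        self.value = value
    }
}

// Division by a NonZeroInt can never hit a zero divisor.
func / (lhs: Int, rhs: NonZeroInt) -> Int {
    lhs / rhs.value
}

// Subtraction cannot preserve the guarantee: 3 - 3 is zero,
// so the result has to fall back to a plain Int.
func - (lhs: NonZeroInt, rhs: NonZeroInt) -> Int {
    lhs.value - rhs.value
}

let three = NonZeroInt(3)
let quotient = three.map { 12 / $0 }
let difference = three.map { $0 - $0 }
```

Only the division benefits from the wrapper; most other arithmetic immediately leaves the wrapped type again, which is the cost-benefit asymmetry described above.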

While true, I feel that the kind of complaint I mentioned came only from developers who had used an Optional type or construct in a different language or context before. At least this is my personal opinion, which I cannot generalize to every developer.

In fact, the above concept of a statically known range could also be applied to numeric types, which would also be "nice to have".

I don't remember in which thread, but I believe this idea was brought up in the past and the core team expressed interest in eventually exploring this territory.

To clarify, it's more like the opposite. Most languages only have Optional types and the benefit of having an explicit Optional type is that most things can be non-Optional. Only having nullable references has been called the billion-dollar mistake, but that's probably understating it.

Right, but I would add that it's not only about saving us from nullability pain and errors, but also about logical and conceptual convenience and static guarantees. I strongly believe that non-emptiness, in whatever form it gets solved, would equally improve certain areas and eliminate many pain points. Personally, I would prefer a solution that covers as many problematic areas as possible.
That said, a simple non-empty guarantee could be generalized by defining the minimum number of statically known values inside a collection. However, if we make this into a range, we would cover not only non-emptiness but also the "fixed size" problem, as we could finally specify the upper bound of collections as well. And yes, I do understand that with each further generalization of these problems, the complexity for the compiler would rise.

I'd be massively +1 on this. It doesn't come up so often in my Swift code, as that tends to be small personal projects, but in my day job with C#, methods like collection.First(...), collection.Last(...) and collection.Single(...), which explicitly call out the fact that a collection is expected to be non-empty, are all over the place.

Alas, in C#, the compiler cannot check the correctness of these calls, so usages of the above blow up in exactly the same 'whoever got unlucky in the stack trace' way as NullReferenceException. Enabling Swift to enforce this in a type safe way would be a massive win, imho.

"...you actually want for a TravelRoute collection is at least two elements" – small amendment, I want at least one element, because first and last can be the same element.

struct Flight {
  let departureCity: City
  let arrivalCity: City

  let departureDate: Date
  let arrivalDate: Date
}

// A route with a single flight: first and last are the same element.
let travelRoute = NonEmptyArray(flight)
let travelDuration = travelRoute.last.arrivalDate.timeIntervalSince(travelRoute.first.departureDate)

"why it's better to try to enforce this in the type system rather than just in an initialiser somewhere" – of course, there are throwing initializers and checks. The problem is that in a large application we need to get data from a server, save it in a local database, and do lots of filter-map and map-reduce operations.
So in many pieces of code there is no guarantee that an Array of flights is not empty, because it is passed from one screen to another and modified, and if a developer makes an error, the array ends up empty.
We also use SwiftLint and discourage forced unwrapping. So when dealing with Array we are always forced to write:

if let first = flightRoute.first, let last = flightRoute.last {
  // This check is meaningless because the route should always be non-empty, but the compiler doesn't know it.
} else {
  // What should we do here? Logically this branch should never be executed, but the compiler doesn't know it.
  // And in practice this code sometimes is executed, because programmers make mistakes and pass an empty flightRoute.
}

You can just use subscripts if there's no reasonable recovery path. The runtime checks will cause a crash on an empty array, and hopefully you have good crash reporting, so you can see that accessing element zero failed, meaning an array was unexpectedly empty.

Yes, this approach is possible. But why should the app crash due to language limitations if we have NonEmptyArray, which completely prevents such mistakes?
There are different ways to write code and handle errors. My point is that using NonEmptyArray is the simplest solution: it provides compiler guarantees instead of runtime checks and makes such errors impossible.

Exactly, if you're doing filtering and reducing then you will not be able to maintain the type-level guarantee that the resulting collection is still non-empty, so you're going to need to make the moral equivalent of that runtime test all over again anyway. If you want to make it more concise, you could wrap it up in a firstAndLast: (Element, Element)? convenience property. A NonEmpty collection type can only carry its weight if you're using a collection in an immutable, append-only or reorder-only fashion across a large enough call graph.
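
A minimal sketch of such a `firstAndLast` convenience property, written as an extension on `BidirectionalCollection` (the tuple labels and the sample data are assumptions for the example):

```swift
extension BidirectionalCollection {
    // Both ends of the collection, or nil if it is empty.
    // For a single-element collection, first and last are the same element.
    var firstAndLast: (first: Element, last: Element)? {
        guard let first = first, let last = last else { return nil }
        return (first, last)
    }
}

let arrivals = [3, 7, 11]
let ends = arrivals.firstAndLast

// A filter can always remove every element, so its result
// has to go through the same check again.
let filteredEnds = arrivals.filter { $0 > 100 }.firstAndLast
```

This collapses the two optional bindings into one, but the emptiness check itself still happens at run time at every use site.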

I think you can also delete elements, as long as you respect the constraint that nothing may be deleted when only one element is left :).
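
One way that removal constraint could be sketched, again with a hypothetical head-plus-tail `NonEmptyArray` (the type and the `removeLast()` semantics below are assumptions, not the NonEmpty library's behaviour):

```swift
// Hypothetical wrapper: the head can never be removed.
struct NonEmptyArray<Element> {
    private(set) var head: Element
    private(set) var tail: [Element]

    init(_ head: Element, _ tail: Element...) {
        self.head = head
        self.tail = tail
    }

    var count: Int { tail.count + 1 }

    // Removes and returns the last element, or returns nil and does
    // nothing when only one element remains.
    @discardableResult
    mutating func removeLast() -> Element? {
        tail.popLast()
    }
}

var stops = NonEmptyArray("Moscow", "New York")
let removed = stops.removeLast()   // removes "New York"
let refused = stops.removeLast()   // only "Moscow" is left, nothing is removed
```

The refusal shows up as an optional return value, so the runtime check moves from reading the collection to mutating it.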