Require Sequence.underestimatedCount be O(1)

The example you are talking about was using a partially-implemented type. IIRC, the problems went away once he implemented Comparable (which is, you know, a pretty fundamental thing).

We can never guard against somebody assuming something isn’t important and not implementing it. Otherwise we can never use protocol extensions.

Luckily Collection’s defaults compose well - and isEmpty has a default which compares startIndex and endIndex. For simple collections, it’s common to make these a computed property; maybe returning a constant or forwarding to a wrapped collection. For complex collections, it’s common to make these stored properties. In any case it’s relatively easy to fulfill the basic O(1) requirement.

If you decide index comparisons are not important and don’t implement them properly, lots of stuff will not work correctly. Nothing we can do about that.

Right; in my view, the scenario in this thread differs in that it's actually worse. Not implementing Comparable conformance was a straight-up error, and where the compiler was deficient is in not having the diagnostics to produce a useful error message. Here,

  1. the compiler can't produce an error about not meeting documented complexity requirements;
  2. we've said, even on this list, that it's OK to implement a type that can't meet a particular protocol complexity requirement and still state the conformance, as long as you document the departure for your users.

So, in the scenario pointed out by @moiseev, doing a thing that (a) can't be diagnosed by the compiler; and (b) is said to be OK as long as you disclose it, would cause the standard library to violate its own rules. Hmm. Choosing to create such a state of affairs wouldn't be a temporary deficiency in compiler implementation but an error in design.

I can't see how you can have this and expect the standard library to bend around you. If subscripting wasn't O(1) basically every complexity guarantee for the algorithms in the standard library would be invalid. That's the price you have to pay for abusing an abstraction.

We're not talking about implementing subscripting incorrectly; we're talking about implementing isEmpty as count == 0, which is both naive and totally something that a programmer would write without a second thought.

I'm not saying that, as an end user, it's a reasonable expectation of the standard library to accommodate deviations from documented semantics. I'm saying that, as designers of the standard library, we should make best effort attempts at making the standard library less brittle when users do (and they will) make mistakes.

1 Like

Sure, that's a noble goal, and it's a great thing to strive for when possible. However, I don't see how this could be possible without penalizing those who correctly conform to Collection's algorithmic guarantees–are you fine with losing these benefits?

...i.e., not accommodating the hypothetical users who misuse underestimatedCount as an alternative for isEmpty? in exchange for decreasing the likelihood of the standard library interacting poorly with end user types because of unintentional O(N) underestimatedCount?

Yes.

I'm undecided regarding whether underestimatedCount should return 1 for non-empty collections. Could you help convince me?

If the real count is not returned, in what scenarios is returning 1 meaningfully superior to 0? If it's to aid exponential growth in growable collections, then this benefit is negligible for non-tiny real counts.

Earlier you mentioned justification where user code has bugs assuming underestimatedCount is not an underestimate. In this case, returning 1 would only help to mask the bug, rather than avert it. Do I misunderstand the scenario?

They're both abuses, but I just realized that I was making the assumption of Collection.count being O(1) so that underestimatedCount would return count, not isEmpty ? 1 : 0. I keep forgetting that we don't have an Iterator or similar so that we have three levels of hierarchy rather than the two we have now.

You can't design stuff, including banning stuff, based on woulda/coulda/shoulda-s. That way lies madness.

Or, as I once read in a C++ FAQ, design your precautions against Murphy, not Machiavelli. Don't make the experience worse for the 99% to futilely engage against the 1% doing it wrong.

And, in your worse case, someone overrides isEmpty to use a O(n) count, which makes isEmpty and an updated underestimatedCount O(n). That's already the current case for underestimatedCount.

Uh,... we do have an IteratorProtcol. Or is the behavior you're thinking about different than what that protocol provides?

But, is anyone really going to also mess around with isEmpty whenever they need to customize count? The default implementation of isEmpty works just fine and is O(1). underestimatedCount shouldn't be used as a quick-and-dirty isEmpty, but it shouldn't have its current implementation in Collection as a quick-and-dirty count either. Having underestimatedCount as isEmpty ? 0 : 1 is a good deal for the 99.9% of users that will never override isEmpty, either maliciously or "helpfully." As I said in a different response, don't anti-optimize to (futilely) block the pathological.

2 Likes

Yeah, I meant more of something that Sequence would inherit from. See my comments on [Pitch] Remove the single-pass requirement on Sequence.

Posted an official proposal here: Add an underestimatedCount proposal by moiseev · Pull Request #819 · apple/swift-evolution · GitHub

Thanks everyone for the discussion!

3 Likes

I don't think the "Alternatives considered" section is correct. You worry about someone making a worse-than-linear override to isEmpty, but that property must be constant (i.e. O(1)), so such overrides are already broken. In other words, a non-broken isEmpty is capable of being used within underestimatedCount, since they're both O(1).

Yes. I remember your comment about it here in the thread. I still believe however that a non-broken underestimatedCount that is conditional on a non-broken isEmpty is worse than an unconditionally non-broken underestimatedCount.

1 Like

I think that anyone still arguing for the isEmpty 0/1 option should address this good question from the thread:

1 Like