[Pitch] Have all adapters in the standard library uniformly expose their `base`

gwendal.roue · April 20, 2022, 9:53am

Now some real stdlib types expose their base, and others do not. One possible interpretation is that some lack of consistency has slipped through multiple code reviews. Another is that some people in the past said "make base public because I need it" (a requirement, not a request). Another is that some types need a public base (not _base) in order to accommodate some inner stdlib needs. A last one is that exposing base was decided pragmatically, case by case, based on experience and the expected use cases of each type. This last interpretation is the one that assumes the most effort from the stdlib designers and developers, because working for the users, not for the API, is much harder. Anyway, a fight over first principles is interesting, but only to some extent.

Nevin · April 20, 2022, 1:54pm

Something about this line of reasoning seems wrong. Like, objectively fallacious as a sequence of logical steps.

You’re not the first on these forums to espouse a view that, “Users should never conform a type they don’t control to a protocol they don’t control.”

It’s been said a lot. And every time I hear it, it always seems off. The logic doesn’t hold up to scrutiny.

Let’s focus on the case at hand: conforming a standard library type to a standard library protocol.

First, note that this is pragmatically useful. The conformance is useful to the programmer (Dave) who makes it. It is also useful to the standard library maintainers. The existence of a 3rd-party conformance provides data that says, “Hey, there’s a missing conformance here.”

Second, if and when the standard library does eventually add the conformance, the programmer (Dave) is not going to say “Ugh, this breaks my code.” No. Instead they’re going to say “Hallelujah! Finally! I’ve been waiting years for this! I can delete so much boilerplate! This is fantastic!”

The programmer wants the standard library to add the conformance. They are frustrated that it has not yet done so. And they are even more frustrated that the standard library actively prevents them from adding the conformance themselves. It is harming their productivity.

Third, even if it does break something when the standard library adds the conformance, that’s a breakage that was known and foreseeable. The programmer (Dave) was reasonably aware of the possibility when they made their own 3rd-party conformance.

Fourth, your particular objection to base exhibits a massive inversion of priority. It amounts to, “Because base is especially useful right now in the absence of a certain conformance, therefore we should not add it until after that conformance is provided.”

It is completely backwards. The fact that the conformance does not exist today, and there are no concrete plans to add that conformance in the near future, is an additional reason why base should exist, on top of all other reasons.

Dave is saying, “There’s a ravine I want to cross, but to build a bridge I need an anchor point.”

And you are saying, “The ravine might eventually get filled in by a public-works project years from now. If that happens you won’t need a bridge, and any bridge you did build would be destroyed. So I think the reason you want an anchor point is terrible. Bad enough to taint the entire idea of installing an anchor point, regardless of what else it might be used for.”

The sequence of logical steps—the reasoning—is bad. Having the anchor point today would be useful. Delaying the anchor point until it becomes less useful, is actively harmful. The sum total of all utility that would have been achieved through using the anchor point (the base property) by all programmers throughout the entire duration of time until the standard library adds the conformance itself, would be lost.

Karl · April 20, 2022, 2:27pm

I agree with the sentiment and will add that I sometimes feel the standard library has more of a focus on currency types with minimum API commitments, and that sometimes I would prefer if it would 'get out of my way', so to speak, and provide reusable types with maximal utility for library authors.

One particularly annoying example is IndexingIterator. It has a trivial implementation but no official constructor, so you can't make an iterator over a custom slice of a collection -- instead, you need to make a slice of the collection, and iterate that. It's a subtle difference, but in our type system an iterator over <T> is not always the same as an iterator over <T.SubSequence>.

The result is that you need to DIY this trivial building block, for like... the most annoying of reasons. It just feels like the kind of thing you shouldn't have to do.
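
For readers who have not hit this, a minimal sketch of the kind of DIY building block being described might look like the following. `RangeIterator` is a hypothetical name, not a standard library type; it walks a range of indices of the original collection, so its type is parameterized over the collection itself rather than over its SubSequence.

```swift
// Hypothetical stand-in for the missing IndexingIterator initializer:
// iterates over `range` of `base` without forming a slice first.
struct RangeIterator<Base: Collection>: IteratorProtocol {
    let base: Base
    var position: Base.Index
    let end: Base.Index

    init(_ base: Base, in range: Range<Base.Index>) {
        self.base = base
        self.position = range.lowerBound
        self.end = range.upperBound
    }

    mutating func next() -> Base.Element? {
        guard position < end else { return nil }
        // Advance only after the current element has been read.
        defer { base.formIndex(after: &position) }
        return base[position]
    }
}

// The iterator's type mentions [Int], not ArraySlice<Int>.
var iterator = RangeIterator([10, 20, 30, 40], in: 1..<3)
while let value = iterator.next() {
    print(value) // 20, then 30
}
```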

lukasa · April 20, 2022, 4:40pm

The risk is that when the standard library adds the conformance, the programmer is long gone. Dave wrote his application, it's shipped, and two years later Apple ships an update to the OS that adds the conformance. Now we run directly into the problems outlined in Retroactive Conformances vs. Swift-in-the-OS. The programmer today wants the conformance, but they're long gone and their app is busted. As is well documented in the annals of computing history, users will rightly blame the last thing that moved: after all, the app worked before and now it doesn't, and the app sure as hell didn't change.

Another constituency you're missing here is library authors. Library authors absolutely must not add retroactive conformances because, again, they're often not able to fix the problems those retroactive conformances cause. Many libraries become essentially unmaintained, or you are forced by other circumstances to use older versions of them. In this instance, the user of the library may not even have known the retroactive conformance was there, but now it's their problem, and they have the unpalatable choice of either forking the library or removing its use.

Put another way, the issue with retroactive conformances is that they fix today's problem by signing up for a future problem, one which will strike without warning and likely fall on a very different person than the one who benefited from the fix.

This analogy isn't the point, but: in this analogy, the issue is that "would be destroyed" is in the passive voice. A better one would be:

"This ravine might eventually get filled in by a public-works project years from now, at which time we will be unaware that you have built a bridge and so will knock it down by accident in an uncontrolled fashion, potentially causing it to fall on whatever lies beneath it. We will be liable for that destruction, not you, and so we would really much rather you didn't build the bridge and certainly have no intention to make it easier for you."

This, however, I agree with. I agree with @lorentey that exposing _base because we want to use it to implement retroactive conformances is a bad idea. But I think exposing it is a good idea: it enables entire classes of algorithms that are otherwise awkward to implement, it costs very little in flexibility of evolution because the types are frozen anyway, and ultimately I have a "consenting adults" view of API design. But I definitely understand why the stdlib devs are hesitant to expose this implementation detail, and if this were the problem I had I would absolutely just copy the implementation myself and move on.

itaiferber · April 20, 2022, 4:45pm

@lukasa voiced my thoughts much better than I could have myself, and I'm very much in agreement with everything he said. To continue stretching the analogy even further:

A similarly unfortunate situation is "While we were planning out the public works project, someone built a bridge which is now so central and heavily-trafficked that we actually can't go through with the project at all because doing so would require tearing it down, and we can't do that. Unfortunately, you'll need to use the bridge forever."

There's a lot of give and take here; sometimes the balance is really difficult to strike.

lorentey · April 20, 2022, 9:43pm

I'm not disagreeing with that. Ultimately I care very little about absolutist arguments (even my own); I just want to make sure the stdlib is as useful as we can possibly make it, without actively harming its future.

So I'm quietly asking, once again: what are some examples for those "entire classes of algorithms" that will be enabled by the addition of EnumeratedSequence.base?

dabrahams · April 20, 2022, 10:23pm

The retroactive conformance is a red herring. Take any protocol with a non-static requirement, even one I've defined myself—if I want to make EnumeratedSequence<T> conditionally conform to that protocol when T does, not having access to the base makes that conformance impossible.
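
A rough sketch of that situation, under stated assumptions: `Named` is a made-up protocol, `Playlist` is a made-up model of it, and `Enumerating` is a hypothetical stand-in for an adapter like EnumeratedSequence that differs only in having a visible `base`. The conditional conformance at the bottom cannot be written without reading the wrapped value.

```swift
// Hypothetical protocol with a non-static requirement.
protocol Named {
    var name: String { get }
}

// Hypothetical adapter in the style of EnumeratedSequence,
// except that the wrapped sequence is visible as `base`.
struct Enumerating<Base: Sequence>: Sequence {
    let base: Base

    func makeIterator() -> AnyIterator<(offset: Int, element: Base.Element)> {
        var offset = 0
        var inner = base.makeIterator()
        return AnyIterator {
            guard let element = inner.next() else { return nil }
            defer { offset += 1 }
            return (offset: offset, element: element)
        }
    }
}

// The conformance forwards to the wrapped value, so it needs `base`.
extension Enumerating: Named where Base: Named {
    var name: String { "enumerated(\(base.name))" }
}

// Hypothetical model of both Sequence and Named.
struct Playlist: Sequence, Named {
    let name: String
    let tracks: [String]
    func makeIterator() -> IndexingIterator<[String]> { tracks.makeIterator() }
}

let adapted = Enumerating(base: Playlist(name: "road trip", tracks: ["a", "b"]))
print(adapted.name) // enumerated(road trip)
```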

(The intention to make a retroactive conformance doesn't invalidate anything, though: the problems with retroactive conformances need to be solved anyway, and it is possible to do so with scoped conformances. With the right features in place there's nothing inherently problematic about a retroactive conformance).

dabrahams · April 20, 2022, 10:46pm

lorentey:

As a user of the language, I love that their interface is completely defined by their protocol conformances -- if I understand the protocol, I understand the type.

As a library author, I love that they get rid of the need to invent a workable public name for a result type that no one wants to remember or spell out. In practice, these collection transformations have a tendency to be chained, resulting in deeply nested A<B<C<D,E<F>,G<H,I>,J>,K>> types that aren't nice to work with at all. Opaque result types point us a way to cut through this mess by hiding it all. (Even if only superficially.)

On a more abstract level, I'd find it exciting to figure out how far we can take things within their constraints -- such as the need to describe a type entirely through its protocol conformances. Weird constraints sometimes lead to breakthroughs. (Then again, sometimes they just make things unnecessarily difficult, like trying to argue over the internet. ¯_(ツ)_/¯)
....
Still, I'd love it if we could figure out a way to use these ideas to build an efficient transformation library.

But to be honest, I don't really see how any of this has anything to do with base properties. (An opaque result type could provide that, too.)

If we ignore the conceptual illegitimacy of a protocol that just exposes a base, and especially that of a protocol that exposes both base0 and base1 and has only two models, Zip2Sequence and ConcatenatedSequence… yeah that's true. I'm not quite that hard-line about protocol legitimacy of course: sometimes a protocol is just a useful language mechanism and not a real concept. This would be awkward but not intolerable.

dabrahams · April 21, 2022, 7:42pm

Obviously it's not about this one adapter's base. Consider Should there be BidirectionalCollection.dropLast(while:)? - #16 by dabrahams which would be impossible had we not exposed the base of reverse collection iterators.
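
As an illustration of the pattern (a sketch, not the code from the linked thread), the helper below searches the reversed view and then maps the result back through `ReversedCollection.Index.base`, which is the position just past the found element in the original collection. The name `droppingLast(while:)` is hypothetical and not a standard library API.

```swift
extension BidirectionalCollection {
    // Hypothetical helper: drops trailing elements that satisfy `predicate`.
    func droppingLast(
        while predicate: (Element) throws -> Bool
    ) rethrows -> SubSequence {
        // Find the last element that should be kept by searching from the back.
        // If every element satisfies the predicate, nothing is kept.
        let end = try reversed()
            .firstIndex(where: { try !predicate($0) })?
            .base ?? startIndex
        return self[..<end]
    }
}

print([1, 2, 3, 0, 0].droppingLast(while: { $0 == 0 })) // [1, 2, 3]
```

Without the index's `base`, there would be no way to translate a position found in the reversed view back into the original collection.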

tevelee · April 22, 2022, 11:30am

How about exposing base as an SPI? That way Dave can implement the conformance using @_spi(Internal) import Swift, and Karoly does not have to worry about users accidentally escaping the abstraction because it's not visible by default.

dabrahams · April 23, 2022, 1:01am

That's really missing the point I'm afraid. As I hope I’ve demonstrated, the utility of these adapters is compromised for many non-retroactive purposes unless they have visible, publicly accessible bases.

filip-sakel · April 26, 2022, 9:28am

I agree with Dave, not to mention that marking something as SPI to protect users signals that there’s a more significant underlying problem with not being able to safely add retroactive conformances, etc.

lorentey · April 26, 2022, 7:59pm

One simple (but not necessarily easy) way to make all of us happy is to include the addition of base in the second version of SE-0312, along with any and all sorely missing conformances that are actually possible to implement.

I don't think it would be a good idea to have base as a general API expectation for wrapper types, but for types that are etched in stone to the extent that EnumeratedSequence and Zip2Sequence are, there is little point in insisting on not violating the gossamer abstraction.

dabrahams · April 27, 2022, 4:19pm

SE-0312 is reviewed and linked here. @timv what do you think of @lorentey's suggestion?

@lorentey: I honestly can't imagine what problem you think you're preventing by not satisfying the general expectation. I understand that you don't like it, but if we're not going to do it, there should be a real practical reason… do you mind explaining?

tera · April 29, 2022, 2:12am

This is the edge case indeed, but still... This is what I am after, and I believe this is BIG: SwiftUI gives you "a promise" of easily testable UI, without the need to resort to snapshot testing, e.g.:

@main
struct HelloWorldApp: App {
    var body: some Scene {
        WindowGroup {
            MyView(param: "hi")
                .onAppear {
                    runTests()
                }
        }
    }
}

struct MyView: View {
    let param: String
    var body: some View {
        Text("Hello, " + param)
            .foregroundColor(.orange)
    }
}

func runTests() {
    let whatItShouldBe = #"....."#
    let s = MyView(param: "World").body
    print(s)
    assert("\(s)" == whatItShouldBe)
}

Text(storage: SwiftUI.Text.Storage.verbatim("Hello, World"), modifiers: [SwiftUI.Text.Modifier.color(Optional(orange))])

So far so good. Setting "whatItShouldBe" to the string above got the UI tested, voila. If some refactoring violates the assert I'd be notified and correct either the code or the test.

And, if you wonder, should the "SwiftUI.Text.Storage.verbatim" or something private like that change insignificantly (e.g. in the next OS version or Xcode), I'd just update my test strings (or select a different string based upon the Xcode / OS version combination). But now this:

        Text("Hello, " + param)
            .font(.body)

SwiftUI.Font(provider: SwiftUI.(unknown context at $111426b50).FontBox<SwiftUI.Font.(unknown context at $111453a20).TextStyleProvider>)

oops. In order to match that I'd somehow need to strip $111426b50. Which is of course doable, but then how to distinguish "body" vs, say, "largeTitle" or Font made with UIFont?

ok, let's do it properly and override description:

extension SwiftUI.Font: CustomStringConvertible {
    public var description: String {
        "WOW"
    }
}

SwiftUI.Text.Modifier.font(Optional(WOW))

So far so good. But how do I get the "body" / "bold" / "italic" / "customName" + size / "testStyle", etc out of it? There are no getters whatsoever defined on Font....

To make the thing totally opaque might be good from the ideological perspective, yet it is a major spanner in the works here. In order to do the above I have to essentially reimplement the major portions of SwiftUI and use my custom wrappers instead of built-in ones.

Sorry for the SwiftUI-ism on this forum. I appreciate this is not exactly the "standard library exposing base" issue, yet it is a similar enough example of how the opaqueness of the API works against those of us who want to go just one small step away from the expected general pathway.

PS. in this particular case I personally wouldn't mind using some back door SPI solution to get the job done. If you know a way - please let me know here or privately.

lorentey · May 9, 2022, 8:51pm

One reason I dislike base properties in general is the semantic complexity they introduce to such simple abstractions as slices or trivial transformations.

Case in point: we evidently can't get base to work consistently even in the most fundamental types in the stdlib.

var str = "Hello world"
let i = str.firstIndex(of: " ")!

var s1 = str[i ..< i]
s1.replaceSubrange(s1.endIndex ..< s1.endIndex, with: ", cruel")
print(s1.base) // "Hello, cruel world"

var s2 = str[i ..< i]
s2.append(contentsOf: ", cruel")
print(s2.base) // ", cruel"

(Of course, this inconsistency also surfaces through indices, somewhat damaging the collection abstraction. But at least append is implicitly* documented to invalidate indices. (Not that other parts of the stdlib care much about such minor details as continuing to use invalid indices...))

* (The passage that spells this out is actually missing from the append docs. Yay, yet another bug!)

Substring.base is ill-defined. To make base a general expectation, at minimum I'd like us to provide a specification of how it is supposed to interact with mutations. (Or the expectation strictly limited to @frozen & read-only collection transformations.)

dabrahams · May 15, 2022, 7:03pm

I'm not sure an expectation of consistency is even appropriate in this case. Once a slice is mutated out-of-place (x[i..<j].mutate() would be in-place), the value of its base may be meaningless (as you seem to suggest by the end of your message).
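
To make the in-place versus out-of-place distinction concrete, here is a small sketch using Array, whose range subscript has a setter. It only illustrates the terminology; it says nothing about what `base` should report afterwards.

```swift
var numbers = [1, 2, 3, 4]

// In place: the mutated slice is written back through the
// range subscript's setter, so `numbers` itself changes.
numbers[1..<3].reverse()
print(numbers) // [1, 3, 2, 4]

// Out of place: the slice is first copied into its own variable,
// so mutating it leaves `numbers` untouched.
var slice = numbers[1..<3]
slice.reverse()
print(slice)   // [2, 3]
print(numbers) // [1, 3, 2, 4]
```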

The result of the first print should at least be considered an efficiency bug, however. There's no reason whatsoever to create that long string.

I see the code, but don't know what it's supposed to mean to me in this context. All I get from that is that you sound a little frustrated… so, my sympathies.

lorentey · June 27, 2022, 7:00pm

That is one argument, yes. Another viewpoint is that slices should always preserve their base collection to enable (syntactically) in-place mutations, as in Array.

var numbers = ["one", "three", "four"]
numbers[1 ..< 1].append("two")
print(numbers) // ["one", "two", "three", "four"]

(Thankfully, neither RangeReplaceableCollection nor String provides a setter for its range subscript operation.)

For what it's worth, Slice does preserve the out-of-bounds parts of the original collection, and it goes to great lengths to laboriously recalculate its startIndex and endIndex after a mutation, instead of simply discarding its base.

This is only tangentially related here, but the issue with this piece of Slice is that the linked code assumes that replaceSubrange will not invalidate any indices preceding the mutated range when the collection conforms to both BidirectionalCollection and RangeReplaceableCollection.

RangeReplaceableCollection mutations are documented to invalidate all indices (no ifs and buts), so this assumption isn't valid.

Changing the code to remove the assumption doesn't seem practical at this point, so we'll need to document it and hope that package authors of collection types will notice the warning before they commit to using Slice in their public API. (Which tends to happen implicitly, as Slice is the default SubSequence type.)

The startIndex/endIndex recalculation I mentioned above includes some additional hidden assumptions about the behavior of range-replaceable collection mutations that also do not derive from RangeReplaceableCollection requirements -- such as that inserting a new element will increase the collection's count by one.

Of course, as it happens, String is a range-replaceable bidirectional collection that violates all of these assumptions. It doesn't always preserve indices that precede a mutated range (e.g., mutating a verbatim-bridged string invalidates all indices within), and distance calculations aren't necessarily consistent with expectations before/after a replacement.

var z = "\u{1f9df}\u{2642}\u{fe0f}" // "🧟♂️"
print(z.count) // 2
z.insert("\u{200d}", at: z.index(after: z.startIndex))
print(z.count) // 1
print(z) // "🧟‍♂️"

Unfortunately, until the upcoming 5.7 release*, Substring used to forward most of its operations to Slice, so it fell victim to these assumptions.

(* Note: We currently have a fix for this on the release branch, but as usual, there is no guarantee that it will remain there until the eventual release.)

The tangential relationship to the topic we're currently discussing is that the effect of collection mutations on the value of the base property is closely related to index invalidation. Discarding the sliced-off parts of the original collection during a slice mutation will necessarily invalidate all indices, which works against these hidden assumptions.

GreatApe · June 27, 2022, 9:30pm

Judging from the comments, it seems I'm alone in not understanding exactly what you are proposing. But could you humour me with a concrete example?

In the first post it seems that you want to expose the underlying collection, but later it appeared that it was its type you were after. The latter seems more palatable, and useful, but it would really be great to see an actual use case!

dabrahams · October 28, 2024, 6:43pm

Sorry to resurrect an old thread, but I was directed back here from a link and I realized I never answered this simple question:

So for future reference, this made-up example is a simplification of real ones I've encountered. It at least applies to all mutable adapters: I have a mutating algorithm that would do exactly the right thing to my array, if only the algorithm worked in reverse. I can put a reversed collection adapter around my array and apply the algorithm, but if I can never get the modified array back out of the adapter, I'm stuck.
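
A minimal sketch of that scenario, with every name invented for illustration: `MutableReversed` is a hypothetical mutable reversing adapter (it is not the standard library's ReversedCollection), and `accumulate()` is a made-up forward-only algorithm. The last step, reading the result back out, is only possible because the adapter exposes `base`.

```swift
// Hypothetical mutable reversing adapter with a visible `base`.
struct MutableReversed<Base: MutableCollection & BidirectionalCollection>: MutableCollection {
    var base: Base

    // Like ReversedCollection.Index, stores the position *after* the element.
    struct Index: Comparable {
        let base: Base.Index
        static func < (lhs: Index, rhs: Index) -> Bool { lhs.base > rhs.base }
    }

    var startIndex: Index { Index(base: base.endIndex) }
    var endIndex: Index { Index(base: base.startIndex) }
    func index(after i: Index) -> Index { Index(base: base.index(before: i.base)) }

    subscript(position: Index) -> Base.Element {
        get { base[base.index(before: position.base)] }
        set { base[base.index(before: position.base)] = newValue }
    }
}

extension MutableCollection where Element: AdditiveArithmetic {
    // Hypothetical forward-only algorithm: in-place running totals.
    mutating func accumulate() {
        var total = Element.zero
        var i = startIndex
        while i != endIndex {
            total += self[i]
            self[i] = total
            formIndex(after: &i)
        }
    }
}

// Running the forward algorithm through the adapter yields suffix sums.
var adapter = MutableReversed(base: [1, 2, 3, 4])
adapter.accumulate()
let suffixSums = adapter.base // only reachable because `base` is visible
print(suffixSums) // [10, 9, 7, 4]
```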