SE-0228: Fix Expressible By String Interpolation

Douglas_Gregor · September 10, 2018, 5:17pm

The review of SE-0228: Fix ExpressibleByStringInterpolation begins now and runs through Tuesday, September 18th, 2018.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to me as the review manager via email or direct message on the forums. If you send me email, please put "SE-0228" somewhere in the subject line.

What goes into a review of a proposal?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift.

When reviewing a proposal, here are some questions to consider:

What is your evaluation of the proposal?
Is the problem being addressed significant enough to warrant a change to Swift?
Does this proposal fit well with the feel and direction of Swift?
If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Thank you for contributing to Swift!

Doug Gregor
Review Manager

nuclearace · September 10, 2018, 6:15pm

Is there any concern that the StringLiteralType associatedtype requirement might collide with the existing magic StringLiteralType typealias that is used by the compiler to determine the type of string literals? I know that naive use of typealias *LiteralType can cause bizarre errors due to not being able to control scoping/lookup of the magic *LiteralTypes.

beccadax · September 10, 2018, 6:51pm

ExpressibleByStringLiteral has an associated type by this name; in fact, all of the literal protocols have associated types with the same name as their magic typealiases. I can’t say I’ve specifically investigated this, but I’d be very surprised if it became a problem.

nuclearace · September 10, 2018, 7:01pm

Yeah, I would be surprised as well. IIRC the magic ones are only looked up in the module scope. Although I did tinker around a bit with trying to allow custom scoping, but it ran into a few issues.

Overall I think this is a great fixup of this ExpressibleBy protocol. I've reached for the current one in the past but gave up trying to use it because of it's deprecatedness and general inflexibility. This is a much more useable version.

jawbroken · September 11, 2018, 2:28am

Very nice proposal, and I particularly like the examples of possible conforming types.

What is your evaluation of the proposal?

I'm not familiar enough with this area to judge the technical details, but the user-facing parts seem flexible and well designed. I think this forms a good basis for future improvements to Swift (e.g. number formatting) and opens up interesting possibilities for developers to make their own ergonomic string-like types.

Is the problem being addressed significant enough to warrant a change to Swift?

Yes, this work keeps the promise made when ExpressibleByStringInterpolation was deprecated.

Does this proposal fit well with the feel and direction of Swift?

Yes, almost by definition, as it allows the standard Swift string interpolation syntax to be used more widely.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

N/A

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read all the previous discussions and pitches, and the final proposal.

jrose · September 11, 2018, 4:31pm

This is great work by both Brent and Michael; I know Brent in particular has been working on this for a long time. I've long wanted to use string interpolations to handle localized strings and this design makes it easy.

That said…I do have some concerns / suggestions.

public protocol StringInterpolationProtocol {
 associatedtype StringLiteralType: _ExpressibleByBuiltinStringLiteral

I feel like this rule is overly restrictive. It seems perfectly reasonable to me to want my string interpolation segments to be built on some type other than String or StaticString, the only _ExpressibleByBuiltinStringLiteral types in the language. With ExpressibleByStringLiteral we made this rule to prevent arbitrary regress, but allowing StringInterpolationProtocol to have one more conversion in it doesn't seem unreasonable. So what happens if we loosen that constraint?

public protocol StringInterpolationProtocol {
 associatedtype StringLiteralType: ExpressibleByStringLiteral

public protocol ExpressibleByStringInterpolation
  : ExpressibleByStringLiteral {
  associatedtype StringInterpolation: StringInterpolationProtocol
    = String.StringInterpolation
    where StringInterpolation.StringLiteralType == StringLiteralType

This constraint in the main protocol no longer quite makes sense. What would we lose if we delete it?

The standard library will also provide default implementations which forward the interpolated string to init(stringLiteral:). The end result is that a type which already supports string literals can support interpolations without any special semantics by simply replacing ExpressibleByStringLiteral with ExpressibleByStringInterpolation.

Hm. Maybe that default implementation can be constrained on some condition, instead of the associated type. Normally, though, I'd want the inference to go the other way: implement the specific case in terms of the general one. That would give us:

extension ExpressibleByStringInterpolation
  where StringLiteralType == StringInterpolation.StringLiteralType {
  init(stringLiteral: StringLiteralType) {
    var singleInterpolation = StringInterpolation(literalCapacity: ???, interpolationCount: 0)
    singleInterpolation.appendLiteral(stringLiteral)
    self.init(stringInterpolation: stringInterpolation)
  }
}

extension ExpressibleByStringInterpolation
  where StringLiteralType == String, StringInterpolation == String.StringInterpolation {
  init(stringInterpolation: StringInterpolation) {
    self.init(stringLiteral: stringInterpolation.make())
  }
}

There is one missing piece there of how to get the capacity out of the existing string literal. Maybe compiler magic can handle this instead of a stdlib implementation, but the point is that if I do have custom interpolation I don't also want to implement init(stringLiteral:). That seems more important to me than the other way around.

jrose · September 11, 2018, 10:19pm

After thinking about this a bit more, maybe breaking the associated type restriction doesn't matter so much. I do still want a default implementation of init(stringLiteral:), though.

davbeck · September 12, 2018, 5:04am

Big thumbs up from me. I think this will be a really useful addition to the language. As I said in an earlier discussion, I was able to adapt this to my Postgres library with very little effort (sans actual compiler integration) and it adds a ton of type safety there. Allowing named parameters in particular adds that extra level of swiftyness.

That being said, there are going to be dozens of different NSAttributesString implementations using this in weird and “creative” ways.

beccadax · September 12, 2018, 6:09am

jrose:

Hm. Maybe that default implementation can be constrained on some condition, instead of the associated type. Normally, though, I'd want the inference to go the other way: implement the specific case in terms of the general one. That would give us:
extension ExpressibleByStringInterpolation
  where StringLiteralType == StringInterpolation.StringLiteralType {
  init(stringLiteral: StringLiteralType) {
    var singleInterpolation = StringInterpolation(literalCapacity: ???, interpolationCount: 0)
    singleInterpolation.appendLiteral(stringLiteral)
    self.init(stringInterpolation: stringInterpolation)
  }
}

extension ExpressibleByStringInterpolation
  where StringLiteralType == String, StringInterpolation == String.StringInterpolation {
  init(stringInterpolation: StringInterpolation) {
    self.init(stringLiteral: stringInterpolation.make())
  }
}
There is one missing piece there of how to get the capacity out of the existing string literal. Maybe compiler magic can handle this instead of a stdlib implementation, but the point is that if I do have custom interpolation I don't also want to implement init(stringLiteral:) . That seems more important to me than the other way around.

I did a quick check in the REPL (and the source) and it looks like String.append(_:) is equivalent to a simple assignment if it's done with an empty string and reserveCapacity(_:) has either not been called, or has been called with 0. So this implementation should not have much, if any, overhead when StringLiteralType == String:

extension ExpressibleByStringInterpolation {
  public init(stringLiteral value: StringLiteralType) {
    var interpolation = StringInterpolation(literalCapacity: 0, interpolationCount: 0)
    interpolation.appendLiteral(value)
    self.init(stringInterpolation: interpolation)
  }
}

What makes me hesitate to do this is that, if we provided both that and the init(stringInterpolation: DefaultStringInterpolation) included in the proposal, the two would be mutually recursive. And the default init(stringInterpolation:) implementation is incredibly convenient—if you can use it, you just don't need to worry about how string interpolation works at all.

I seem to recall that we put in a mutually-recursive-defaults check for some other type (maybe Hashable?), and I guess we could do something like that here, but that seems like too much work for a nonessential convenience.

Alejandro_Martinez · September 12, 2018, 11:59am

+1

What is your evaluation of the proposal?

Really positive. The current interpolation system is enough for basic strings usage but we need something more powerful if we want to use it with better type safety.

Is the problem being addressed significant enough to warrant a change to Swift?

I think so. This will open a world of better APIs in Swift.

Does this proposal fit well with the feel and direction of Swift?

Absolutely. Relying more on the type system and adding extra safety with the convenience of string interpolation is really great for Swift.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

Other languages interpolations indeed help by keeping the information of the interpolation segment so you can do custom things with it. The proposal offers that in the form of appendInterpolation overloads.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I've been following the proposal since its inception in the forums and I felt the need for this improvement while I was playing with Solving the Strings Problem in Swift some time ago.

I'm not an expert on the performance aspect of the proposal so I can't comment on that, there is other people on the thread talking about it already ^^

Thanks :)

jrose · September 12, 2018, 4:35pm

Hm. We could trick this into working by making the StringLiteralType of String be a dummy wrapper type instead of itself (because then the StringLiteralType won't match DefaultStringInterpolation.StringLiteralType), but then that goes into the ABI for all time. It's just bizarre to me that I'd go through all the trouble of implementing string interpolation and I wouldn't be able to treat "foo" as a degenerate interpolated literal.

beccadax · September 13, 2018, 11:52am

So you don't think I'm ignoring this: This made me think of an interesting alternative design where StringInterpolationProtocol conforms to _BuiltinExpressibleByStringLiteral and all literals expressing an interpolatable type, even single-segment ones, pass through it. So far it looks clever but flawed, but I don't feel that I've quite gotten far enough yet to rule it out. I'll probably have more to say later.

zwaldowski · September 14, 2018, 5:39am

What is your evaluation of the proposal?

I am fanatically in favor of this proposal.

My only doubt is that it requires a compiler hack to model the way appendLiteral(_:) works. That's not really a change to the status quo, and such a modeling is definitely outside the scope of an otherwise slam-dunk proposal.

Is the problem being addressed significant enough to warrant a change to Swift?

The design this proposal intends to replace is known to be deficient, and should (must? should? please?) be addressed in time for ABI stability. This is not only a superlative design, but it's the design we need.

Does this proposal fit well with the feel and direction of Swift?

100%. It doesn't add any surface area to the language itself, but makes it orders of magnitude more useful in certain domains. It's a win-win.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

N/A.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I've followed this proposal since its beginning.

Karl · September 14, 2018, 6:46am

A couple of points:

(meta-points)

The proposal seems to have some inconsistent gist links (e.g. https://gist.github.com/brentdax/79fa038c0af0cafb52dd seems to use a much older version of the proposed protocol, even though a later link points to an updated version). I saw that code after reading the proposal and had no idea what was going on. I thought I must have missed a huge chunk.
The more-accurate gist (https://gist.github.com/brentdax/0b46ce25b7da1049e61b4669352094b6) appears to have a number of mistakes. For example, HTML.append(HTML) appends the raw strings, since it is pre-sanitized but HTML.StringInterpolation.append(HTML) doesn't, StringInterpolationBufferProtocol vs StringInterpolationProtocol, LocalizableStringValue vs LocalizedStringInterpolable, example conformances for SQL but not LocalizedString. Of course it isn't "real code", but it's more difficult to follow and I had to read each of them several times before I was sure there weren't intentional details I was missing.

(questions/comments after reading a couple of times):

I don't think the name StringInterpolationProtocol conforms to the API design guidelines. Sure, StringProtocol doesn't either, but we shouldn't make the problem worse.
Not sure I like the name StringInterpolation as the associated type. Builder or something to that effect seems more appropriate, given its intention as a scratchpad. It also seems somewhat analogous to the use of StringBuilder in C# and Java, so there may be a term-of-art argument.
Even though appendInterpolation can take any type (and have various overloads), most of the examples still seem to have a sister-protocol (e.g. LocalizableStringInterpolable, SQLiteStatementConvertible) and are trivially-rewritable in terms of protocols. The call-site would perhaps be not quite as streamlined, but it also would be more discoverable and wouldn't require a magic, one-off language feature. The 'alternatives considered' doesn't really explore this in as much depth as I would assume necessary.

Overall, I like the proposed feature, but I also think it deserves an appropriate level of scrutiny.

beccadax · September 14, 2018, 10:40am

tl;dr: It looks like designs which route ordinary literals through StringInterpolationProtocol can significantly compromise the performance of non-interpolated string literals.

So, I threw together an alternative design. This probably would change if developed further—in fact, I think ExpressibleByStringInterpolation might be eliminated entirely, and a lot of names would shift around—but I think the prototype is the right "shape" to draw some conclusions.

The essential difference in this design is that, instead of having separate StringLiteralType/init(stringLiteral:) and StringInterpolation/init(stringInterpolation:) associated types/initializers, the StringLiteralType of an interpolation-supporting type conforms to StringInterpolationProtocol. In theory this means that all literal uses on a conforming type, even ones with no interpolation, are handled through StringInterpolationProtocol; for instance, initializing a String would always involve a DefaultStringInterpolation instance. In practice, though, Swift sidesteps this for String by using the builtin protocols.

In the prototype, conditional conformance's rules limit this design somewhat. In order to be allowed as a StringLiteralType, StringInterpolationProtocol conforms to _ExpressibleByBuiltinStringLiteral. We would want it to conditionally conform to _ExpressibleByBuiltinUTF16StringLiteral too, but you can't make a protocol conditionally conform. A deeper redesign might be able to address this issue by reshuffling some of the protocols around.

I benchmarked this alternative to compare it with the proposal's design (results: proposal, alternative). The results were pretty meh; the alternative was generally slightly worse, but optimization work might close that gap.

However, our benchmarks all use String, and String probably bypasses the slower parts of this alternative, so I also wrote two new benchmarks. Both of them operate on a simple type which uses DefaultStringInterpolation:

struct CustomString: ExpressibleByStringInterpolation {
	var value: String
	
	// Initializer is slightly different in the alternative design.
	init(stringLiteral: String) {
		self.value = stringLiteral
	}
}

The first one is basically just the StringInterpolation benchmark, but constructing a CustomString instead of a String. The results are as muddled as the interpolation benchmarks on normal Strings:

-O: 10,721 vs. 10,594; -Onone: 13,052 vs. 13,395; -Osize: 10,580 vs. 11,080

But the other new benchmark is a different story. This one has no interpolations at all—that is, it's a benchmark of custom types expressed as non-interpolated string literals. And there we see a pretty heavy penalty for the alternative design:

-O: 167 vs. 414; -Onone: 1,007 vs. 1,433; -Osize: 179 vs. 407

So routing ordinary literals through the interpolation machinery seems to be a bad idea; designs that do this might have elegant properties, but they may also have a significant performance penalty for non-interpolated string literals.

I suspect this lesson would also apply to making init(stringLiteral:) forward to init(stringInterpolation:), but I can test that if you'd like.

jrose · September 14, 2018, 5:41pm

Hm. We'd have to allow StringLiteralType to be an interpolation in that case, which could lead to arbitrary nesting. (In your prototype, DefaultStringInterpolation is _ExpressibleByBuiltinStringLiteral, but obviously that doesn't scale.)

And we certainly don't want ExpressibleByStringInterpolation to not refine ExpressibleByStringLiteral, in case anyone is writing code that's generic over ExpressibleByStringLiteral (and probably some other constraints too).

I guess I really do want a default implementation of init(stringLiteral:), but only if you're not using the default implementation of init(stringInterpolation:). And there's no clean way to write that today. (A dirty way is with a constrained implementation that's deprecated, so that it provides a better match but then warns.)

beccadax · September 14, 2018, 10:23pm

The links early in the "Motivation" section are showing examples using the current (deprecated) design; the gist link later on, in the " Proposed solution" section, shows similar types reimplemented with the proposal's design. I'm sorry that the difference wasn't clear.

I'm pretty sure this works correctly—HTML.StringInterpolation.appendInterpolation(_: HTML) calls through to HTML.append(_: HTML), which knows that the strings inside the HTML instance are raw. Maybe I'm not seeing your point, though.

The other issues you identified are true; I've done another pass through the gist to correct those and a couple of other small issues. If you notice any other problems, please let me know.

This is a common convention we use when the name of a protocol conflicts with the name of a concrete type conforming to the protocol. For instance, Sequence.Iterator conforms to IteratorProtocol; LazySequence conforms to LazySequenceProtocol; KeyedDecodingContainer<K> conforms to KeyedDecodingContainerProtocol; NSObject conforms to NSObjectProtocol; SCNAnimation conforms to SCNAnimationProtocol. If that convention is omitted from the guidelines, it was either considered unimportant or it was a mistake.

I vacillated on this name too; at various points it was StringInterpolationType (to match StringLiteralType), StringInterpolationBuffer, StringInterpolator, etc. I ultimately decided that an instance represents the interpolation itself, so it should be called a "string interpolation". But if the core team doesn't like these names, I would not oppose changing them—I just don't want us to bikeshed a proposal that's centered on a fairly complex, performance-sensitive design with a lot of tradeoffs to consider.

Many do, but the most important—DefaultStringInterpolation, the type used by String—doesn't. Even if it did, the type system doesn't support requirements like this:

associatedtype InterpolatableType: protocol
mutating func appendInterpolation<T: InterpolatableType>(_ value: T)

And even if it did support that, there might be use cases for types where all interpolations require a parameter label (e.g. logging system where everything has to be \(public: value) or \(private: value)) or additional parameters (e.g. alternate design where you use \(value, .public) or \(value, .private)).

This section went much more in depth on this at one point, but we cut it because the proposal was already very long. Suffice it to say, we decided that any formal requirement we could think of would restrict some sort of plausible design, and you can always write a refined StringInterpolationProtocol to improve generic behavior—something like:

protocol UnconstrainedStringInterpolationProtocol:
    StringInterpolationProtocol {
  mutating func appendInterpolation<T>(_ value: T)
}
extension DefaultStringInterpolation:
    UnconstrainedStringInterpolationProtocol {}
func makeString<S: ExpressibleByStringInterpolation>
    (of type: S.Type, with value: Int) -> S
    where S.StringInterpolation: UnconstrainedStringInterpolationProtocol {
  return "Made string with \(value)"
}

beccadax · September 14, 2018, 11:20pm

A somewhat less dirty way is to make the init(stringLiteral:) unavailable instead of deprecated; that way, you get a hard error. That's implemented here, along with tests proving that it works.

I ran benchmarks in an alternative PR (results: proposal, alternative), and they were nearly identical to the charts above. But here the (substantial—up to 2.4x when you're just storing a String) overhead is something you could avoid by manually implementing init(stringLiteral:) alongside your init(stringInterpolation:).

I don't know if the core team would consider a slow but convenient default implementation to be a good addition to the standard library, but I don't think it's unreasonable to decide it is. If we do add it, we should clearly document that init(stringLiteral:) should be implemented manually for performance-sensitive types.

jrose · September 14, 2018, 11:33pm

Huh, I would have expected "unavailable" to mean that the conformance checker would skip it, since that's what happens with overloads.

I'm confused as to why the overhead is so high. Surely the whole thing can be inlined down to

init(stringLiteral: String) {
  var interpolation_string = String(reservingCapacity: 0)
  interpolation_string += stringLiteral // should dynamically be a replacement
  self.value = interpolation_string
}

…oh, maybe that initial String.init(reservingCapacity:) isn't being short-circuited?

beccadax · September 14, 2018, 11:44pm

reserveCapacity(0) was a no-op in the REPL, but maybe there’s something going on here that prevents this. I’ll see what I can see when I’m at a computer again.