Sculpting Strings

Hi all, I would like to share some thoughts and examples on formatting strings in Swift and where we could go from here.

I’m mainly focusing on formatting strings for programmer consumption, such as for logging, pretty-printing textual formats, and command line applications. When displaying rich text to a user as part of a UI, it’s best to rely on the platform’s conventions and the UI framework as much as possible.

With the revised approach to string interpolation in Swift 5.0, it is possible to define custom interpolations for common tasks, such as formatting numbers and aligning text. A handful of custom interpolations go a long way for common and simple usage.

Some example custom interpolations that I use frequently:

extension DefaultStringInterpolation {
  public mutating func appendInterpolation<B: BinaryInteger>(
    hex: B, uppercase: Bool = false
  ) {
    appendInterpolation(String(hex, radix: 16, uppercase: uppercase))
  }

  public enum Alignment {
    case left
    case right
    case center
  }

  public mutating func appendInterpolation(
    aligning content: String,
    to align: Alignment,
    columns: Int,
    fill: Character = " "
  ) {
    let segmentLength = content.count
    let fillerCount = columns - segmentLength
    guard fillerCount > 0 else {
      appendInterpolation(content)
      return
    }

    var filler = String(repeating: fill, count: fillerCount)
    let insertIdx: String.Index
    switch align {
    case .left:
      insertIdx = filler.startIndex
    case .right:
      insertIdx = filler.endIndex
    case .center:
      insertIdx = filler.index(filler.startIndex, offsetBy: fillerCount / 2)
    }
    filler.insert(contentsOf: content, at: insertIdx)
    appendInterpolation(filler)
  }

  public enum FloatNotation {
    case decimal
    case exponential
    case scientific
    // ...
  }

  public mutating func appendInterpolation<F: FloatingPoint>(
    precision: Int? = nil,
    notation: FloatNotation = .decimal,
    uppercase: Bool = false,
    _ value: F
  ) {
    // ...
  }
}

Example usage:

print("Address: 0x\(hex: 1234567890, uppercase: true)")
// Address: 0x499602D2

print("""
  Address: 0x\(aligning: "\(hex: 1234567890, uppercase: true)",
               to: .right, columns: 16, fill: "0")
  """)
// Address: 0x00000000499602D2

let values = ["dog", "badger", "cat", "aardvark", "bear", "mouse"]
let columns = values.reduce(0) { max($0, $1.count) }
for value in values {
  print("\(aligning: value, to: .center, columns: columns, fill: "_")")
}
/*
 __dog___
 _badger_
 __cat___
 aardvark
 __bear__
 _mouse__
*/

print("Price: $\(precision: 2, notation: .decimal, 123.456789)")
// Price: $123.46
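
The body of the floating-point interpolation is elided above. As a rough sketch only, assuming Foundation is available and using a hypothetical `fixed:` label (not the signature shown above), decimal notation could simply defer to a printf-style conversion, which rounds rather than truncates:

```swift
import Foundation

extension DefaultStringInterpolation {
  // Hypothetical helper, not the elided implementation above:
  // renders `value` in decimal notation with `precision` fractional digits.
  mutating func appendInterpolation(fixed value: Double, precision: Int = 2) {
    // Build a printf-style spec such as "%.2f"; negative precisions are clamped to 0.
    let spec = "%.\(max(0, precision))f"
    appendLiteral(String(format: spec, value))
  }
}

let price = "Price: $\(fixed: 123.456789, precision: 2)"
print(price)
// Price: $123.46
```

Exponential notation and generic `FloatingPoint` support are left out here; they would need either more conversion specifiers or a hand-written digit generator.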

Programmatic Formatting

One downside of these custom interpolations is that they don’t programmatically compose as cleanly as we would like: we have to introduce a new literal for each level of nesting. Long-term, we want the standard library to provide more building blocks and convenience facilities to enable domain-specific formatting and pretty-printing.

Pretty-printing nested structures involves tracking indentation by passing around context, e.g. Pretty Printing HTML. We’d want to provide composable types, interfaces, and idioms for the programmatic building, manipulation, and display of formatted data. For an (operator/combinator-heavy) example of generalized pretty-printing, see DoctorPretty, a Swift port of Wadler’s “A prettier printer”.
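
To make "passing around context" concrete, here is a minimal sketch in which the context is just an indentation level threaded through a recursive render function. The `Node` type and `render` function are made up for illustration and are not taken from the linked examples:

```swift
// Hypothetical document model, for illustration only.
enum Node {
  case text(String)
  case element(String, [Node])
}

// The indentation level is the "context" handed down to each child.
func render(_ node: Node, indent: Int = 0) -> String {
  let pad = String(repeating: "  ", count: indent)
  switch node {
  case .text(let content):
    return pad + content + "\n"
  case .element(let name, let children):
    // Children are rendered one level deeper than their parent.
    let body = children.map { render($0, indent: indent + 1) }.joined()
    return pad + "<\(name)>\n" + body + pad + "</\(name)>\n"
  }
}

let page = Node.element("ul", [
  .element("li", [.text("dog")]),
  .element("li", [.text("cat")]),
])
print(render(page), terminator: "")
```

Every call site has to remember to pass `indent` along, which is the kind of bookkeeping that composable building blocks could take over.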

ExpressibleByStringInterpolation in Swift 5.0 enhances programmatic formatting: types that conform have convenient interpolation syntax and custom interpolations. Formatting information can be accumulated during interpolation, programmatically queried/modified, and later applied during rendering. See here for an example of how powerful the revised ExpressibleByStringInterpolation can be when applied to efficient logging.
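
As a small sketch of that accumulate/query/render split, the hypothetical `StyledText` type below records each segment along with its formatting choice, lets the caller rewrite those choices, and only produces a String when asked. All names here are assumptions for illustration:

```swift
// Hypothetical type: records segments instead of producing a String eagerly.
struct StyledText: ExpressibleByStringInterpolation {
  enum Segment {
    case literal(String)
    case number(Int, radix: Int)
  }

  struct StringInterpolation: StringInterpolationProtocol {
    var segments: [Segment] = []

    init(literalCapacity: Int, interpolationCount: Int) {
      segments.reserveCapacity(interpolationCount * 2 + 1)
    }
    mutating func appendLiteral(_ literal: String) {
      segments.append(.literal(literal))
    }
    // The radix is stored, not applied, at interpolation time.
    mutating func appendInterpolation(_ value: Int, radix: Int = 10) {
      segments.append(.number(value, radix: radix))
    }
  }

  var segments: [Segment]

  init(stringLiteral value: String) {
    segments = [.literal(value)]
  }
  init(stringInterpolation: StringInterpolation) {
    segments = stringInterpolation.segments
  }

  // Formatting is applied only here, at render time.
  func render() -> String {
    segments.map { segment -> String in
      switch segment {
      case .literal(let text): return text
      case .number(let value, let radix): return String(value, radix: radix)
      }
    }.joined()
  }
}

var text: StyledText = "id=\(255) mask=\(255, radix: 2)"
// Query and modify the recorded formatting before rendering: force hex everywhere.
text.segments = text.segments.map { (segment: StyledText.Segment) -> StyledText.Segment in
  if case .number(let value, _) = segment { return .number(value, radix: 16) }
  return segment
}
print(text.render())
// id=ff mask=ff
```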

This is a fairly open-ended design space and we want to see how libraries make use of ExpressibleByStringInterpolation. For now, we can focus on the current frustrations and pain-points, which revolve around things like simple numeric formatting and padding.

Standard Extensible Format Strings

A format string specifies how to interpolate values of certain types, but can be declared independently of the actual values to be substituted in. Individual values can be provided elsewhere in code (even partially), and the actual insertion of those values can be deferred until render time. For example, os_log uses an extension of NSString’s format string, which in turn is compatible with IEEE’s fprintf. This format string exists in the binary, and is applied lazily to multiple payloads throughout the execution of the program. This forum post shows how a custom interpolation conformance can statically construct such a format string from a series of interpolation calls.
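
For a feel of "declared once, applied to many payloads", here is a minimal sketch using Foundation's existing `String(format:)` with printf-style specifiers. The format and payloads are made up for illustration, and this is not how os_log itself is implemented:

```swift
import Foundation

// One format string, declared once and independent of any values.
let rowFormat = "%5ld | %8.2f"

// Payloads are supplied elsewhere; substitution happens only at render time.
let payloads: [(Int, Double)] = [(42, 3.14159), (7, 1234.5)]
let rows = payloads.map { String(format: rowFormat, $0.0, $0.1) }
rows.forEach { print($0) }
// "   42 |     3.14"
// "    7 |  1234.50"
```

Note that nothing checks the specifiers against the argument types here, which is exactly the gap a statically constructed format string type could close.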

We could benefit from a standard format string type for Swift, statically-constructible through its custom conformance to ExpressibleByStringInterpolation. This type is analogous to an unapplied string interpolation, and can be a common interchange format for systems that want programmatic or deferred formatting. We should consider leveraging an existing standard, such as used by fprintf.

Can anyone think of an immediate advantage or pressing need that would prioritize this?

Unifying Interpolations with String Initializers (and/or Functions Returning String)

There is potential overlap between a String initializer taking a type, a custom interpolation taking a type, and a computed String variable on that type. Figuring out where new functionality naturally fits will require some experience with these features.
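
To illustrate the overlap, the same hexadecimal rendering can be exposed all three ways. The names `init(hex:)`, `hexString`, and the `hex:` interpolation label below are hypothetical spellings chosen for this sketch:

```swift
// 1. A String initializer taking the value.
extension String {
  init<B: BinaryInteger>(hex value: B) {
    self = String(value, radix: 16)
  }
}

// 2. A computed String property on the value's type.
extension BinaryInteger {
  var hexString: String { String(self, radix: 16) }
}

// 3. A custom interpolation taking the value.
extension DefaultStringInterpolation {
  mutating func appendInterpolation<B: BinaryInteger>(hex value: B) {
    appendLiteral(String(value, radix: 16))
  }
}

let n = 255
let viaInit = String(hex: n)
let viaProperty = n.hexString
let viaInterpolation = "\(hex: n)"
print(viaInit, viaProperty, viaInterpolation)
// ff ff ff
```

All three produce the same text, so the choice is mostly about discoverability and how well each spelling composes at the call site.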

-1 for format strings. Let's stick to something the compiler can verify rather than inventing something new for the compiler to verify. (EDIT: But +1 for discussing what to do in this space that's better!)

We could benefit from a standard format string type for Swift, statically-constructible through its custom conformance to ExpressibleByStringInterpolation. This type is analogous to an unapplied string interpolation, and can be a common interchange format for systems that want programmatic or deferred formatting. We should consider leveraging an existing standard, such as used by fprintf.

@Michael_Ilseman Is what you are suggesting a type conforming to ExpressibleByStringInterpolation that constructs a format string behind the scenes from a string interpolation, and exposes it and its arguments through some properties? E.g. something like

struct FormatString: ExpressibleByStringInterpolation {
  func appendLiteral(...) { ... }
  func appendInterpolation() { ... }
  public var formatString = { ... }
  public var args = ...
}

So Swift users could write let f: FormatString = "Number \(x)" and then read f.formatString to obtain "Number %d".
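
Spelled out a bit more, a runnable version of that idea might look like the following. This is only a sketch: the nested StringInterpolation type is what the protocol requires, but the mapping from Swift types to conversion specifiers and the `[Any]` argument storage are assumptions made for illustration.

```swift
struct FormatString: ExpressibleByStringInterpolation {
  struct StringInterpolation: StringInterpolationProtocol {
    var format = ""
    var args: [Any] = []

    init(literalCapacity: Int, interpolationCount: Int) {
      format.reserveCapacity(literalCapacity + interpolationCount * 2)
    }

    // Literal text is copied through, escaping "%" so it stays literal.
    mutating func appendLiteral(_ literal: String) {
      for character in literal {
        format += (character == "%") ? "%%" : String(character)
      }
    }

    // Each supported type appends a specifier and records its argument.
    mutating func appendInterpolation(_ value: Int) {
      format += "%d"
      args.append(value)
    }
    mutating func appendInterpolation(_ value: Double) {
      format += "%f"
      args.append(value)
    }
    mutating func appendInterpolation(_ value: String) {
      format += "%s"
      args.append(value)
    }
  }

  let formatString: String
  let args: [Any]

  init(stringLiteral value: String) {
    var interpolation = StringInterpolation(
      literalCapacity: value.count, interpolationCount: 0)
    interpolation.appendLiteral(value)
    self.init(stringInterpolation: interpolation)
  }

  init(stringInterpolation: StringInterpolation) {
    formatString = stringInterpolation.format
    args = stringInterpolation.args
  }
}

let x = 42
let f: FormatString = "Number \(x), 100% done"
print(f.formatString)  // Number %d, 100%% done
print(f.args)          // [42]
```

Because only the listed overloads exist, interpolating an unsupported type is a compile-time error, which is one way to get valid format strings by construction.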

Definitely. Would there be interesting overlap with the logging effort? Anything useful to generalize from your view?

This is all pretty hand-wavy right now.

The goal would be for the ExpressibleByStringInterpolation conformer to create valid format strings by construction and expose an API to bind it to a set of parameters to produce a String. This binding could be dynamic.

The conformer may keep internal state so it can do runtime validation, but would also be able to produce an fprintf-style format string for interoperability. I think this would be a generalization of @ravikandhadai’s work and we’d leverage his experience.
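
Still hand-wavy, but here is one possible shape for this, as a sketch: placeholders are recorded by type when the literal is written, and values are bound (and validated) later. `Template`, `bind`, `cFormat`, and the use of metatypes such as `Int.self` as placeholders are all hypothetical choices for illustration.

```swift
struct Template: ExpressibleByStringInterpolation {
  enum Piece {
    case literal(String)
    case int
    case string
  }
  enum BindingError: Error {
    case wrongCount
    case wrongType(position: Int)
  }

  struct StringInterpolation: StringInterpolationProtocol {
    var pieces: [Piece] = []
    init(literalCapacity: Int, interpolationCount: Int) {}
    mutating func appendLiteral(_ literal: String) { pieces.append(.literal(literal)) }
    // Placeholders carry a type but no value yet.
    mutating func appendInterpolation(_: Int.Type) { pieces.append(.int) }
    mutating func appendInterpolation(_: String.Type) { pieces.append(.string) }
  }

  let pieces: [Piece]
  init(stringLiteral value: String) { pieces = [.literal(value)] }
  init(stringInterpolation: StringInterpolation) { pieces = stringInterpolation.pieces }

  // An fprintf-style rendering of the template, for interoperability.
  var cFormat: String {
    pieces.map { piece -> String in
      switch piece {
      case .literal(let text):
        return text.map { $0 == "%" ? "%%" : String($0) }.joined()
      case .int: return "%d"
      case .string: return "%s"
      }
    }.joined()
  }

  // Dynamic binding, validating argument count and types at run time.
  func bind(_ values: [Any]) throws -> String {
    var output = ""
    var next = 0
    for piece in pieces {
      if case .literal(let text) = piece {
        output += text
        continue
      }
      guard next < values.count else { throw BindingError.wrongCount }
      switch (piece, values[next]) {
      case (.int, let value as Int): output += String(value)
      case (.string, let value as String): output += value
      default: throw BindingError.wrongType(position: next)
      }
      next += 1
    }
    guard next == values.count else { throw BindingError.wrongCount }
    return output
  }
}

let greeting: Template = "Hello \(String.self), you have \(Int.self) new messages"
print(greeting.cFormat)
// Hello %s, you have %d new messages
let bound = try? greeting.bind(["Ada", 3])
print(bound ?? "binding failed")
// Hello Ada, you have 3 new messages
```

The same template can be bound repeatedly to different payloads, and a mismatched payload surfaces as a thrown error rather than undefined behavior.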

Yes absolutely. Deconstructing a string interpolation into a static format string and a sequence of (dynamic) arguments is mostly what the new os_log APIs and the planned compiler optimizations for it are aiming to achieve. This is certainly generalizable.

Just to clarify, as you know, if the format string is not required to be static, it requires no new compiler support. (In fact, an experimental implementation for os_log format strings is already available here.) This in itself could be quite useful for some applications. If the format string must be constructed statically, as in the os_log APIs, it requires additional compiler support, which is being developed as part of that effort.